TTT4110: Information & Signal theory


Note: Analysis and processing of signals that carry information. 
Representation of signals in time and frequency domain. 


Instructor: Bojana Gajic
Scientific Assistants: Sebastien de la Ketthulle, Anders Gjendemsjø
Course Webpage: NTNU TTT4110


Welcome to the TTT4110: Information & Signal Theory Connexions pages.
On these pages we will present the following topics:


• Signals
• Convolution
• The Sampling Theorem
• Basic Information Theory
• Filters
• Decibel with DSP applications


The material in these pages is partly based on the book Representing
Information by Signals, 4th edition, by Tor Ramstad.


Introduction 


To describe signals and to understand how signals can carry information, we
need tools for the mathematical description and manipulation of signals.


In this chapter we introduce several important signals and show simple
methods of describing them. Different methods for manipulating signals are
available depending on which type of signal we are looking at. The
elementary operations for manipulating signals and sequences will be
described.

Contents of this chapter


• Introduction (current module)
• Discrete time signals
• Analog signals
• Discrete vs Analog signals
• Energy & Power
• Exercises


The simplest signals are one-dimensional and what follows is a 
classification of them. 


Classification of signals 


Analog signals 


An analog signal is a continuous function of a continuous variable.
Referring to [link], this means that both the 1st and the 2nd axis are
continuous. The 1st axis will in general correspond to the variable t,
meaning time. In this context we define


• signal range - the possible amplitude values the signal can take
• signal axis - the time interval for which the signal exists


[Figure: Reference axes, showing the 1st axis and the 2nd axis.]


Time discrete signals 


A time discrete signal is a continuous function of a discrete variable.
Referring to [link], the 1st axis is discrete while the 2nd axis is
continuous. Often we assign the values of the 1st axis to a variable n. Time
discrete signals often originate from analog signals being sampled. More on
that in the Sampling theorem chapter.


[Figure: Time discrete signal x(n).]


Note that the signal is only defined for integer values along the 1st axis. We 
do not have any information other than the values at index points. 


Digital signals 


If the signal is a discrete function of a discrete variable, i.e. both the
1st and the 2nd axis are discrete, then the signal is digital. An example of
a digital signal is a binary sequence. Digital signals often arise from
sampling an analog signal and assigning each sample to one of a discrete set
of values.


Periodic vs non periodic signals 


All the signals mentioned above can be periodic. For time discrete and
digital signals one has to be extra cautious when "declaring" periodicity, as
discussed in the Frequency definitions and periodicity module. The figures
below show a periodic signal with period T and an aperiodic signal.


(Figures by Melissa Selik) 


Periodic signal 


Aperiodic signal 


Matlab file
time_discrete.m
Discrete time signals


The signals and relations presented in this module are quite similar to those 
in the Analog signals module. So do compare and find similarities and 
differences! 


Sequences 


Generally a time discrete signal is a sequence of real or complex numbers.
Each component in the sequence is identified by an index:
..., x(n-1), x(n), x(n+1), ...


Example:
[x(n)] = [0.5 2.4 3.2 4.5] is a sequence. Using the index to identify a
component we have x(0) = 0.5, x(1) = 2.4 and so on.


Manipulating sequences 


• Addition: Add the components with the same index individually.
• Multiplication by a constant: Multiply every component by the constant.
• Multiplication of sequences: Multiply the components with the same index
individually.
• Delay: A delay by k implies that we shift the sequence by k. For this to
make sense the sequence has to be of infinite length.


Example:

Given the sequences [x(n)] = [0.5 2.4 3.2 4.5] and [y(n)] = [0.0 2.2 7.2
5.5].

a) Addition. [z(n)] = [x(n)] + [y(n)] = [0.5 4.6 10.4 10.0]

b) Multiplication by a constant c = 2. [w(n)] = 2 * [x(n)] = [1.0 4.8 6.4 9.0]
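
The elementary operations can be sketched in a few lines of Python. This is a minimal illustration only: it assumes finite sequences stored as lists starting at index 0, and the helper names (add, scale, multiply, delay) are hypothetical, not taken from the course material. The delay pads with zeros, which assumes the sequence is zero outside the listed range.

```python
def add(x, y):
    """Add two equal-length sequences component by component."""
    return [a + b for a, b in zip(x, y)]


def scale(c, x):
    """Multiply every component of x by the constant c."""
    return [c * a for a in x]


def multiply(x, y):
    """Multiply two equal-length sequences component by component."""
    return [a * b for a, b in zip(x, y)]


def delay(x, k):
    """Delay x by k >= 0 samples (assumes x is zero outside the list)."""
    return [0.0] * k + list(x)


x = [0.5, 2.4, 3.2, 4.5]
y = [0.0, 2.2, 7.2, 5.5]

z = add(x, y)        # approximately [0.5, 4.6, 10.4, 10.0]
w = scale(2, x)      # approximately [1.0, 4.8, 6.4, 9.0]
d = delay(x, 2)      # two leading zeros, then the original samples
```

Because the values are floating point numbers, results such as 2.4 + 2.2 may differ from 4.6 in the last digit, so comparisons should use a small tolerance.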


Elementary signals & relations 


The unit sample 


The unit sample is a signal which is zero everywhere except when its 
argument is zero, then it is equal to 1. Mathematically 


Note: $\delta(n) = \begin{cases} 1 & \text{if } n = 0 \\ 0 & \text{otherwise} \end{cases}$


The unit sample function is very useful in that it can be seen as the
elementary constituent of any discrete signal. Let x(n) be a sequence. Then
we can express x(n) as follows (using the unit sample definition and the
delay operation)

Equation:

$x(n) = \sum_{k=-\infty}^{\infty} x(k)\,\delta(n-k)$
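
The idea that a sequence is a sum of scaled, delayed unit samples can be checked numerically. The sketch below is an illustration under assumptions: the sequence is stored as a dictionary from index to value and is taken to be zero elsewhere, and the names unit_sample and x_rebuilt are hypothetical.

```python
def unit_sample(n):
    """Unit sample: 1 when the argument is zero, 0 otherwise."""
    return 1 if n == 0 else 0


# x(n) for n = 0..3; assumed zero for every other index.
x = {0: 0.5, 1: 2.4, 2: 3.2, 3: 4.5}


def x_rebuilt(n):
    """Sum over k of x(k) * delta(n - k); only the term k == n survives."""
    return sum(xk * unit_sample(n - k) for k, xk in x.items())


rebuilt = [x_rebuilt(n) for n in range(-1, 5)]
```

For every index n exactly one term of the sum is non-zero, so the rebuilt values equal the original samples, and indices outside the stored range give zero.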


The unit step 


The unit step function is equal to zero when its index is negative and equal
to one for non-negative indexes, see [link] for plots.


Note: $u(n) = \begin{cases} 1 & \text{if } n \geq 0 \\ 0 & \text{otherwise} \end{cases}$


[Figure: Two unit step functions plotted against n; the second is delayed by 5.]


Trigonometric functions 


The discrete trigonometric functions are defined as follows. n is the
sequence index and $\omega$ is the angular frequency. $\omega = 2\pi f$, where f is the
digital frequency.


Note: $x(n) = \sin(\omega n)$


Note: $x(n) = \cos(\omega n)$


A discrete sine with digital frequency 1/20. 
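
A sequence like the one in the figure can be generated directly from the definition. The following is a minimal sketch using only the Python standard library; the choice of 40 samples is arbitrary.

```python
import math

f = 1 / 20                   # digital frequency
omega = 2 * math.pi * f      # digital angular frequency

# Two periods of the discrete sine x(n) = sin(omega * n).
x = [math.sin(omega * n) for n in range(40)]
```

Since f = 1/20, the samples repeat every 20 indices, and the peak value 1 is reached at n = 5, a quarter of a period.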


The complex exponential function 


The complex exponential function is central to signal processing and some
call it the most important signal. Remember that it is a sequence and that
$i = \sqrt{-1}$ is the imaginary unit.


Note: $x(n) = e^{i\omega n}$


Euler's relations 


The complex exponential function can be written as a sum of its real and
imaginary part.

Equation:

$x(n) = e^{i\omega n} = \cos(\omega n) + i\sin(\omega n)$


By complex conjugating [link] and adding the result to, or subtracting it
from, [link] we obtain Euler's relations.


Note: $\cos(\omega n) = \frac{e^{i\omega n} + e^{-i\omega n}}{2}$


Note: $\sin(\omega n) = \frac{e^{i\omega n} - e^{-i\omega n}}{2i}$


The importance of Euler's relations can hardly be stressed enough. 
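
Euler's relations are easy to verify numerically. The sketch below compares the exponential forms of cosine and sine with the library functions; the function names cos_euler and sin_euler and the test frequency are arbitrary choices made for this illustration.

```python
import cmath
import math


def cos_euler(omega, n):
    """cos(omega*n) computed as (e^{i omega n} + e^{-i omega n}) / 2."""
    return ((cmath.exp(1j * omega * n) + cmath.exp(-1j * omega * n)) / 2).real


def sin_euler(omega, n):
    """sin(omega*n) computed as (e^{i omega n} - e^{-i omega n}) / (2i)."""
    return ((cmath.exp(1j * omega * n) - cmath.exp(-1j * omega * n)) / 2j).real


omega = 2 * math.pi / 20
max_error = max(
    max(abs(cos_euler(omega, n) - math.cos(omega * n)),
        abs(sin_euler(omega, n) - math.sin(omega * n)))
    for n in range(-20, 21)
)
```

The largest deviation is at the level of floating point rounding, which is what the relations predict.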


Matlab files 


unit_step_discrete.m
Analog signals


The signals and relations presented in this module are quite similar to
those in the Discrete time signals module. So do compare and find
similarities and differences!


Manipulating signals


Mathematical operations on analog signals are unambiguous. We require
that the signals are defined over the same time interval when using
operations such as addition, multiplication, division and so on.


Elementary signals & relations 


The (Dirac) delta function 


The delta function is a peculiar function that has zero duration, infinite 
height, but still unit area! Mathematically we have the following two 
properties 


Note: $\delta(t) = 0$ for $t \neq 0$


Note: $\int_{-\infty}^{\infty} \delta(t)\,dt = 1$


The delta function has a useful property, namely the sampling property.
Equation:

$\int_{-\infty}^{\infty} x(t)\,\delta(t-\tau)\,dt = x(\tau)$


At this stage this may not seem particularly useful, so for now just convince
yourself that the above relation holds.


(We assume that x(t) is "well behaved" at $t = \tau$, that is continuous and
finite.)
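
One way to convince yourself of the sampling property is to replace the delta function by a very narrow rectangular pulse of unit area and evaluate the integral numerically. The sketch below does this with a midpoint Riemann sum. The pulse shape, the width and the function name sample_with_pulse are assumptions made for this illustration; a true delta function cannot be represented on a computer.

```python
import math


def sample_with_pulse(x, tau, width, steps=1000):
    """Approximate the integral of x(t) * delta(t - tau) dt.

    The delta function is replaced by a rectangle of the given width and
    height 1/width centred at tau, so the pulse has unit area.
    """
    dt = width / steps
    total = 0.0
    for j in range(steps):
        t = tau - width / 2 + (j + 0.5) * dt   # midpoint of sub-interval j
        total += x(t) * (1.0 / width) * dt
    return total


tau = 0.3
approx = sample_with_pulse(math.cos, tau, width=1e-3)
exact = math.cos(tau)
```

As the pulse gets narrower the result approaches x(tau), provided x(t) is continuous at tau, in line with the assumption stated above.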


The unit step function 


The unit step function is equal to zero when its argument is negative and
equal to one for non-negative arguments, see [link] for plots.


Note: $u(t) = \begin{cases} 1 & \text{if } t \geq 0 \\ 0 & \text{otherwise} \end{cases}$


Two unit step functions. 


Unit step function, no delay. 


Unit step function, delayed by 5. 


Trigonometric functions 


The trigonometric functions are central to signal processing and
telecommunications. They are defined as follows, where $\Omega$ is the angular
frequency. $\Omega = 2\pi F_0$, where $F_0$ is the frequency of the signal.


Note: $x(t) = \sin(\Omega t)$


Note: $x(t) = \cos(\Omega t)$


The complex exponential function 


The complex exponential function is central to signal processing and some
call it the most important signal. $i = \sqrt{-1}$ is the imaginary unit.


Note: $x(t) = e^{i\Omega t}$


Euler's relations


The complex exponential function can be written as a sum of its real and
imaginary part.
Equation:

$x(t) = e^{i\Omega t} = \cos(\Omega t) + i\sin(\Omega t)$


By complex conjugating [link] and adding the result to, or subtracting it
from, [link] we obtain Euler's relations.


Note: $\cos(\Omega t) = \frac{e^{i\Omega t} + e^{-i\Omega t}}{2}$


Note: $\sin(\Omega t) = \frac{e^{i\Omega t} - e^{-i\Omega t}}{2i}$


The importance of Euler's relations can hardly be stressed enough. 


Matlab file 


unit_step_analog.m 
Discrete vs Analog


When comparing analog and discrete time signals, we find that there are many
similarities. Often we only need to substitute the variable t with n and
integration with summation. Still, there are some important differences that
we need to know. As the complex exponential signal is truly central to
signal processing, we will study it in more detail.


Analog 


The complex exponential function is defined as $x(t) = e^{i\Omega t}$. If $\Omega$ (rad/second)
is increased, the rate of oscillation will increase continuously. The complex
exponential function is also periodic for any value of $\Omega$. In figure [link] we
have plotted $e^{it}$ and $e^{i3t}$ (the real parts only). In [link] we see that the red
plot, corresponding to a higher value of $\Omega$, has a higher rate of oscillation.


[Figure: Real parts of complex exponentials.]


Discrete time 


The discrete time complex exponential function is defined as $x(n) = e^{i\omega n}$.
If we increase $\omega$ (rad/sample), the rate of oscillation will increase and
decrease periodically. The reason is
$e^{i(\omega + 2\pi k)n} = e^{i\omega n} e^{i2\pi kn} = e^{i\omega n}$, where $n, k \in \mathbb{Z}$.


This implies that the complex exponential with digital angular frequency $\omega$
is identical to a complex exponential with $\omega_1 = \omega + 2\pi$, see [link].
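
The identity can be checked numerically by generating both sequences and comparing them sample by sample. This is a minimal sketch; the base frequency of 0.3π and the index range are arbitrary choices.

```python
import cmath
import math

omega = 0.3 * math.pi
omega1 = omega + 2 * math.pi     # shifted by one full turn

indices = range(-10, 11)
a = [cmath.exp(1j * omega * n) for n in indices]
b = [cmath.exp(1j * omega1 * n) for n in indices]

# Largest difference between corresponding samples.
max_diff = max(abs(p - q) for p, q in zip(a, b))
```

The difference is at rounding level because n is an integer. The same comparison with a non-integer time variable would show two clearly different signals, which is the contrast with the analog case.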


[Figure: Real parts of two discrete exponentials that are identical.]


The rate of oscillation will increase until $\omega = \pi$, then it decreases and
repeats after $2\pi$. In [link] we see that as we increase the angular frequency
towards $\pi$ the rate of oscillation increases. If you download the Matlab files
included at the end of this module you can adjust the parameters and see
that the rate of oscillation will decrease when exceeding $\pi$ (but less than
$2\pi$).


Two discrete exponentials with 
different frequency. 


Note: We need to consider discrete time exponentials over a (digital
angular) frequency interval of $2\pi$ only.


Low (digital angular) frequencies correspond to $\omega$ near even multiples
of $\pi$. High (digital angular) frequencies correspond to $\omega$ near odd
multiples of $\pi$.
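
The statement about low and high frequencies can be illustrated by measuring how much the complex exponential changes from one sample to the next. The measure used here, the magnitude of the difference between consecutive samples, is an assumption of this sketch and serves only as a rough proxy for the rate of oscillation; the function name step_size is hypothetical.

```python
import cmath
import math


def step_size(omega):
    """Magnitude of the change between consecutive samples of e^{i omega n}.

    The value is the same for every n, so samples n = 0 and n = 1 are used.
    """
    return abs(cmath.exp(1j * omega * 1) - cmath.exp(1j * omega * 0))


low = step_size(0.1 * math.pi)        # near an even multiple of pi
mid = step_size(0.5 * math.pi)
high = step_size(1.0 * math.pi)       # an odd multiple of pi
wrapped = step_size(1.9 * math.pi)    # near 2*pi, behaves like a low frequency
```

The change per sample grows as the frequency approaches π and shrinks again towards 2π, so 1.9π behaves like the low frequency 0.1π.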


Matlab files 
complex_exponential.m 
Frequency definitions and periodicity


Frequency definitions 


In signal processing we use several types of frequencies. This may seem 
confusing at first, but it is really not that difficult. 


Analog frequency 


The frequency of an analog signal is the easiest to understand. A
trigonometric function with argument $\Omega t = 2\pi F t$ generates a periodic
function with


• a single frequency F
• period T
• the relation $T = \frac{1}{F}$


Frequency is then interpreted as how many periods there are per time unit. 
If we choose seconds as our time unit, frequency will be measured in Hertz, 
which is most common. 


Digital frequency 


The digital frequency is defined as $f = \frac{F}{F_s}$, where $F_s$ is the sampling
frequency. The sampling interval is the inverse of the sampling frequency,
$T_s = \frac{1}{F_s}$. A discrete time signal with digital frequency f therefore has a
frequency given by $F = f F_s$ if the samples are spaced at $T_s = \frac{1}{F_s}$.


Consequences 


In the design of digital sinusoids we do not have to settle for one physical
frequency. We can associate any physical frequency F with the digital
frequency f by choosing the appropriate sampling frequency $F_s$ (using the
relation $f = \frac{F}{F_s}$).


According to the relation $T_s = \frac{1}{F_s}$, choosing an appropriate sampling
frequency is equivalent to choosing a sampling interval, which implies
that digital sinusoids can be designed by specifying the sampling interval.
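
The relations between physical frequency, digital frequency and sampling frequency amount to one multiplication or division. The sketch below illustrates this; the function names and the sampling frequencies 8000 Hz and 44100 Hz are example values chosen for this illustration, not values from the course.

```python
def digital_frequency(F, Fs):
    """Digital frequency f = F / Fs (periods per sample)."""
    return F / Fs


def physical_frequency(f, Fs):
    """Physical frequency F = f * Fs (Hz when Fs is in Hz)."""
    return f * Fs


f = 1 / 20

# The same digital sinusoid played back at two sampling frequencies.
F_at_8k = physical_frequency(f, 8000.0)       # 400 Hz
F_at_44k = physical_frequency(f, 44100.0)     # 2205 Hz

# Sampling interval that corresponds to Fs = 8000 Hz.
Ts = 1 / 8000.0
```

The same sequence of numbers therefore represents a 400 Hz tone or a 2205 Hz tone depending only on the sampling interval used.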


Angular frequencies 


The angular frequencies are obtained by multiplying the frequencies by the
factor $2\pi$:


• Angular frequency $\Omega = 2\pi F$
• Digital angular frequency $\omega = 2\pi f$


Signal periodicity 


Any analog sine or cosine function is periodic. So it may seem surprising
that discrete trigonometric signals are not necessarily periodic. Let us define
periodicity mathematically.


If for all $k \in \mathbb{Z}$ we have


• Analog signals: $x(t) = x(t + kT)$, then x(t) is periodic with period T.
• Discrete time signals: $x(n) = x(n + kN)$, then x(n) is periodic with
period N.


Example:
Consider the signal $x(t) = \sin(2\pi F t)$, which obviously is periodic. You
can check by using the periodicity definition and some trigonometric
identities.


Example:

Consider the signal $x(n) = \sin(2\pi f n)$. Q: Is this signal periodic?

A: To check we will use the periodicity definition and some trigonometric
identities.

Periodicity is obtained if we can find an N which leads to
$x(n) = x(n + kN)$ for all $k \in \mathbb{Z}$. Let us expand $\sin(2\pi f(n + kN))$.
Equation:


$\sin(2\pi f(n + kN)) = \sin(2\pi f n)\cos(2\pi f kN) + \cos(2\pi f n)\sin(2\pi f kN)$


To make the right hand side of [link] equal to $\sin(2\pi f n)$, we need
$\cos(2\pi f kN) = 1$ and $\sin(2\pi f kN) = 0$ for every k, which imposes a
restriction on the digital frequency f. According to [link] only
$fN = m$ will yield periodicity, $m \in \mathbb{Z}$.
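
The condition that fN must be an integer can be turned into a small routine for rational digital frequencies. The sketch below uses Python's fractions module: if f is written in lowest terms, the smallest N that makes fN an integer is the denominator. The function name smallest_period is hypothetical, and an irrational frequency cannot be represented this way, which mirrors the fact that such a signal has no period.

```python
import math
from fractions import Fraction


def smallest_period(f):
    """Smallest positive integer N such that f * N is an integer.

    f must be a Fraction; Fraction keeps it in lowest terms, so the
    answer is simply its denominator.
    """
    return f.denominator


N_a = smallest_period(Fraction(1, 8))    # 8
N_b = smallest_period(Fraction(2, 3))    # 3
N_c = smallest_period(Fraction(3, 2))    # 2

# Numerical check of x(n) = x(n + N) for f = 2/3.
f = 2 / 3
max_diff = max(
    abs(math.sin(2 * math.pi * f * n) - math.sin(2 * math.pi * f * (n + N_b)))
    for n in range(30)
)
```

Passing a Fraction rather than a float is deliberate: a float such as 2/3 is stored as a nearby binary fraction with a very large denominator, which would give a misleading period.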


Example:
Consider the following signals $x(t) = \cos(2\pi \times \frac{1}{8} t)$ and
$x(n) = \cos(2\pi \times \frac{1}{8} n)$, as shown in [link].


[Figure: a) $\cos(2\pi \times \frac{1}{8} t)$ plotted against t, b) $\cos(2\pi \times \frac{1}{8} n)$ plotted against n.]


Are the signals periodic, and if so, what are the periods?


Both the physical and the digital frequency are 1/8, so both signals are
periodic with period 8.


Example:
Consider the following signals $x(t) = \cos(2\pi \times \frac{2}{3} t)$ and
$x(n) = \cos(2\pi \times \frac{2}{3} n)$, as shown in [link].


[Figure: a) $\cos(2\pi \times \frac{2}{3} t)$ plotted against t, b) $\cos(2\pi \times \frac{2}{3} n)$ plotted against n.]


Are the signals periodic, and if so, what are the periods?

The frequencies are 2/3 in both cases. The analog signal then has period
3/2. The discrete signal has to have a period that is an integer, so the
smallest possible period is then 3.


Example:
Consider the following signals $x(t) = \cos(2t)$ and $x(n) = \cos(2n)$, as
shown in [link].


[Figure: a) $\cos(2t)$ plotted against t, b) $\cos(2n)$ plotted against n.]


Are the signals periodic, and if so, what are the periods?

The frequencies are $1/\pi$ in both cases. The analog signal then has period $\pi$.
The discrete signal is not periodic because the digital frequency is not a
rational number.


Conclusion 


For a time discrete trigonometric signal to be periodic its digital frequency 
has to be a rational number, i.e. given by the ratio of two integers. 
Contrast this to analog trigonometric signals. 


Matlab file 
periodicity.m 
Energy and Power


From physics we have learned that energy is work and power is work per time
unit. Energy is measured in joules (J) and power in watts (W). In signal
processing, energy and power are defined more loosely, without any
necessary physical units, because the signals may represent very different
physical entities. We can say that energy and power are a measure of the
signal's "size".


Signal Energy


Analog signals 


Since we often think of a signal as a function of varying amplitude through
time, it stands to reason that a good measurement of the strength of a signal
would be the area under the curve. However, this area may have a negative
part. This negative part does not have less strength than a positive signal of
the same size. This suggests either squaring the signal or taking its absolute
value, then finding the area under that curve. It turns out that what we call
the energy of a signal is the area under the squared signal, see [link].


Note: $E_a = \int_{-\infty}^{\infty} |x(t)|^2\,dt$


Note that we have used the squared magnitude (absolute value) in case the
signal is complex valued. If the signal is real, we can leave out the
magnitude operation.


[Figure: Sketch of energy calculation. The energy of x(t) is the shaded region.]


Discrete signals 


For time discrete signals the "area under the squared signal" makes no
sense, so we will have to use another energy definition. We define energy as
the sum of the squared magnitude of the samples. Mathematically

Note: $E_d = \sum_{n=-\infty}^{\infty} |x(n)|^2$


Example:


Given the sequence $y(l) = b^l u(l)$, where u(l) is the unit step function, find
the energy of the sequence.

We recognize the samples of y(l) as the terms of a geometric series. Thus we
can use the formula for the sum of a geometric series and we obtain the energy,

$E_d = \sum_{l=0}^{\infty} (y(l))^2 = \sum_{l=0}^{\infty} b^{2l} = \frac{1}{1 - b^2}$

This expression is only valid for $|b| < 1$. If we have a larger $|b|$, the
series will diverge. The signal y(l) then has infinite energy. So let's have
a look at power...
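
The closed form for the energy of a decaying exponential sequence can be compared with a truncated sum. The sketch below is an illustration with b = 0.5, an arbitrary value satisfying |b| < 1; truncating at 200 terms is also an arbitrary choice that is more than enough for this b.

```python
b = 0.5

# Energy as a (truncated) sum of squared samples of y(l) = b**l, l >= 0.
partial_energy = sum((b ** l) ** 2 for l in range(200))

# Closed form from the geometric series, valid only for |b| < 1.
closed_form = 1 / (1 - b ** 2)

# For |b| >= 1 the partial sums keep growing instead of settling.
growing = [sum((1.1 ** l) ** 2 for l in range(terms)) for terms in (10, 20, 40)]
```

For b = 0.5 both values equal 4/3 to within rounding, while the partial sums for b = 1.1 increase without bound as more terms are included.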


Signal Power 


Our definition of energy seems reasonable, and it is. However, what if the 
signal does not decay fast enough? In this case we have infinite energy for 
any such signal. Does this mean that a fifty hertz sine wave feeding into 
your headphones is as strong as the fifty hertz sine wave coming out of your 
outlet? Obviously not. This is what leads us to the idea of signal power, 
which in such cases is a more adequate description. 


[Figure: Signal with infinite energy, with one period marked from -T/2 to T/2.]


Analog signals 


For analog signals we define power as energy per time interval.


Note: $P_a = \frac{1}{T_0} \int_{-T_0/2}^{T_0/2} |x(t)|^2\,dt$


Discrete signals 


For time discrete signals we define power as energy per sample.


Note: $P_d = \frac{1}{N} \sum_{n=N_1}^{N_1+N-1} |x(n)|^2$


Example:
Given the signals $x_1(t) = \sin(2\pi t)$ and $x_2(n) = \sin(\pi \frac{1}{10} n)$, shown in
[link], calculate the power for one period.


For the analog sine we have $P_a = \frac{1}{1} \int_{0}^{1} \sin^2(2\pi t)\,dt = \frac{1}{2}$.
For the discrete sine we get $P_d = \frac{1}{20} \sum_{n=1}^{20} \sin^2(\pi \frac{1}{10} n) = 0.500$.


Download power_sine.m for plots and calculation. 
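
For readers without Matlab, the same calculation can be sketched in Python. This is not the content of power_sine.m, which is not reproduced here; it is an independent illustration that approximates the analog integral with a midpoint Riemann sum over one period and evaluates the discrete sum directly.

```python
import math

# Discrete sine with digital frequency 1/20: one period is N = 20 samples.
N = 20
P_d = sum(math.sin(math.pi * n / 10) ** 2 for n in range(1, N + 1)) / N

# Analog sine with period T0 = 1, integrated with a midpoint Riemann sum.
T0 = 1.0
steps = 10000
dt = T0 / steps
P_a = sum(
    math.sin(2 * math.pi * (j + 0.5) * dt) ** 2 for j in range(steps)
) * dt / T0
```

Both results come out as 0.5, matching the values in the example.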


[Figure: Analog and discrete time sine. Panels: analog sine; discrete time sine.]


Matlab files 


energy_area.m, power_sine.m
Exercises


Problems related to the Signals chapter. 
Exercise:


Problem:

Find the digital frequency of $x(n) = \cos(2\pi \times \sqrt{3}\, n)$. Is the signal
periodic? If so, find the shortest possible period.


Solution:

Write $\cos(2\pi \times \sqrt{3}\, n)$ as $\cos(2\pi f n)$, where f is the digital
frequency. We see that the digital frequency is $\sqrt{3}$. For a trigonometric
signal to be periodic the digital frequency has to be a rational number,
i.e. $f = \frac{m}{N}$, where both m and N are integers. N is the signal period. Here
the digital frequency is not a rational number, hence the signal is not
periodic.

Exercise:


Problem:

Find the digital frequency of $x(n) = \cos(2\pi \times \sqrt{4}\, n)$. Is the signal
periodic? If so, find the shortest possible period.


Solution:

Write $\cos(2\pi \times \sqrt{4}\, n)$ as $\cos(2\pi f n)$, where f is the digital
frequency. We see that the digital frequency is $\sqrt{4} = 2$. For a
trigonometric signal to be periodic the digital frequency has to be a
rational number, i.e. $f = \frac{m}{N}$, where both m and N are integers. N is the
signal period. In this case the digital frequency is a rational number,
$f = 2 = \frac{2}{1}$, hence the signal is periodic. The period, N, is given by
$N = \frac{m}{f} = \frac{m}{2}$. Since N has to be an integer, we obtain the shortest
possible period letting m = 2, which yields N = 1.


Exercise:


Problem:

Find the digital frequency of $x(n) = \sin(2\pi \times 1.5\, n)$. Is the signal
periodic? If so, find the shortest possible period.


Solution:

Write $\sin(2\pi \times 1.5\, n)$ as $\sin(2\pi f n)$, where f is the digital frequency. We
see that the digital frequency is 1.5. The digital frequency is a rational
number (3/2), hence the signal is periodic. The period, N, is given by
$N = \frac{m}{f} = \frac{2m}{3}$. Since N has to be an integer, we obtain the shortest
possible period letting m = 3, which yields N = 2.


Exercise:


Problem:

Referring to example 2, find the analog and digital frequency of x(t)
and x(n) respectively.


Solution:

Using the same reasoning as above we easily see that the analog sine
has frequency 1, while the discrete time sine has digital frequency
1/20.


Introduction to Convolution 


In addition to the operations performed on signals in the Signals chapter
there are several more. The most important operation is linear filtering,
which can be performed by convolution. The reason that linear filtering is
so important to signal processing is that it solves many problems and that it
is relatively simple to describe mathematically. In this chapter we will be
looking at convolution.


Convolution helps to determine the effect a system has on an input signal. It
can be shown that a linear, time-invariant system is completely
characterized by its impulse response. Using the sampling property of the
delta function for continuous time signals and the unit sample for
discrete time signals, we can decompose a signal into an infinite sum /
integral of scaled and shifted impulses. By knowing how a system affects a
single impulse, and by understanding the way a signal is composed of
scaled and summed impulses, it seems reasonable that it should be possible
to scale and sum the impulse responses of a system in order to determine
what output signal will result from a particular input. This is precisely
what convolution does - convolution determines the system's output
from knowledge of the input and the system's impulse response.


Contents of this chapter


• Introduction (current module)
• Convolution - Discrete time
• Convolution - Continuous time
• Properties of convolution


Discrete Time Convolution 

Convolution is a concept that extends to all systems that are both linear and
time-invariant (LTI). It will become apparent in this discussion that this
condition is necessary by demonstrating how linearity and time-invariance
give rise to convolution.


Introduction 


Convolution, one of the most important concepts in electrical engineering,
can be used to determine the output a system produces for a given input
signal. It can be shown that a linear time invariant system is completely
characterized by its impulse response. The sifting property of the discrete
time impulse function tells us that the input signal to a system can be
represented as a sum of scaled and shifted unit impulses. Thus, by linearity,
it would seem reasonable to compute the output signal as the sum of
scaled and shifted unit impulse responses. That is exactly what the
operation of convolution accomplishes. Hence, convolution can be used to
determine a linear time invariant system's output from knowledge of the
input and the impulse response.


Convolution and Circular Convolution 


Convolution 


Operation Definition 


Discrete time convolution is an operation on two discrete time signals
defined by the sum
Equation:


$(f * g)[n] = \sum_{k=-\infty}^{\infty} f[k]\, g[n-k]$


for all signals f, g defined on $\mathbb{Z}$. It is important to note that the operation of
convolution is commutative, meaning that
Equation:


$f * g = g * f$


for all signals f, g defined on $\mathbb{Z}$. Thus, the convolution operation could
have been just as easily stated using the equivalent definition
Equation:


$(f * g)[n] = \sum_{k=-\infty}^{\infty} f[n-k]\, g[k]$


for all signals f, g defined on $\mathbb{Z}$. Convolution has several other important
properties not listed here but explained and derived in a later module.
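
For finite sequences the defining sum can be evaluated directly. The following sketch assumes both sequences start at index 0 and are zero elsewhere, so the infinite sum reduces to a finite double loop; the function name convolve is a hypothetical helper written for this illustration.

```python
def convolve(f, g):
    """Linear convolution of two finite sequences that start at index 0.

    Each product f[k] * g[m] contributes to output index n = k + m,
    which is the same as summing f[k] * g[n - k] over k.
    """
    out = [0.0] * (len(f) + len(g) - 1)
    for k, fk in enumerate(f):
        for m, gm in enumerate(g):
            out[k + m] += fk * gm
    return out


f = [1, 2, 3]
g = [1, 1]

fg = convolve(f, g)   # [1.0, 3.0, 5.0, 3.0]
gf = convolve(g, f)   # same result, illustrating commutativity
```

The output has length len(f) + len(g) - 1, and swapping the arguments gives the same result.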


Definition Motivation 


The above operation definition has been chosen to be particularly useful in
the study of linear time invariant systems. In order to see this, consider a
linear time invariant system H with unit impulse response h. Given a
system input signal x we would like to compute the system output signal
H(x). First, we note that the input can be expressed as the convolution
Equation:


$x[n] = \sum_{k=-\infty}^{\infty} x[k]\, \delta[n-k]$


by the sifting property of the unit impulse function. By linearity
Equation:


$H(x[n]) = \sum_{k=-\infty}^{\infty} x[k]\, H(\delta[n-k])$.


Since $H(\delta[n-k])$ is the shifted unit impulse response $h[n-k]$, this gives
the result
Equation:


$H(x[n]) = \sum_{k=-\infty}^{\infty} x[k]\, h[n-k] = (x * h)[n]$.


Hence, convolution has been defined such that the output of a linear time
invariant system is given by the convolution of the system input with the
system unit impulse response.


Graphical Intuition 


It is often helpful to be able to visualize the computation of a convolution in
terms of graphical processes. Consider the convolution of two functions
f, g given by
Equation:


$(f * g)[n] = \sum_{k=-\infty}^{\infty} f[k]\, g[n-k] = \sum_{k=-\infty}^{\infty} f[n-k]\, g[k]$.


The first step in graphically understanding the operation of convolution is to
plot each of the functions. Next, one of the functions must be selected, and
its plot reflected across the k = 0 axis. For each integer n, that same
function must be shifted right by n (that is, to the left when n is negative).
The point-wise product of the two resulting plots is then computed, and then
all of the values are summed.


Example:
Recall that the impulse response for a discrete time echoing feedback
system with gain a is
Equation:

$h[n] = a^n u[n]$,


and consider the response to an input signal that is another exponential
Equation:

$x[n] = b^n u[n]$.


We know that the output for this input is given by the convolution of the
impulse response with the input signal
Equation:

$y[n] = x[n] * h[n]$.


We would like to compute this operation by beginning in a way that
minimizes the algebraic complexity of the expression. However, in this
case, each possible choice is equally simple. Thus, we would like to
compute
Equation:

$y[n] = \sum_{k=-\infty}^{\infty} a^k u[k]\, b^{n-k} u[n-k]$.


The step functions can be used to further simplify this sum. Therefore,
Equation:

$y[n] = 0$


for n < 0 and
Equation:

$y[n] = \sum_{k=0}^{n} a^k b^{n-k}$


for $n \geq 0$. Hence, provided $a \neq b$, we have that
Equation:

$y[n] = \begin{cases} 0 & n < 0 \\ \frac{b^{n+1} - a^{n+1}}{b - a} & n \geq 0 \end{cases}$
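
The result of this example can be checked numerically. For n ≥ 0 the convolution sum of the two exponentials is the finite sum of a^k b^(n-k) over k from 0 to n, which for a ≠ b equals (b^(n+1) - a^(n+1)) / (b - a). The sketch below compares the two expressions; the values a = 0.5 and b = 0.8 and the function names are arbitrary choices for this illustration.

```python
a = 0.5   # gain of the echoing feedback system, h[n] = a**n * u[n]
b = 0.8   # base of the input exponential, x[n] = b**n * u[n]


def y_sum(n):
    """Convolution sum: only 0 <= k <= n survives the two step functions."""
    if n < 0:
        return 0.0
    return sum(a ** k * b ** (n - k) for k in range(n + 1))


def y_closed(n):
    """Closed form of the same finite geometric sum, valid for a != b."""
    if n < 0:
        return 0.0
    return (b ** (n + 1) - a ** (n + 1)) / (b - a)


max_diff = max(abs(y_sum(n) - y_closed(n)) for n in range(-3, 15))
```

At n = 0 both expressions give 1, the product of the first samples of the input and the impulse response.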


Circular Convolution 


Discrete time circular convolution is an operation on two finite length or
periodic discrete time signals defined by the sum
Equation:


$(f \circledast g)[n] = \sum_{k=0}^{N-1} \hat{f}[k]\, \hat{g}[n-k]$


for all signals f, g defined on $\mathbb{Z}[0, N-1]$ where $\hat{f}$, $\hat{g}$ are periodic
extensions of f and g. It is important to note that the operation of circular
convolution is commutative, meaning that

Equation:


$f \circledast g = g \circledast f$


for all signals f, g defined on $\mathbb{Z}[0, N-1]$. Thus, the circular convolution
operation could have been just as easily stated using the equivalent
definition

Equation:


$(f \circledast g)[n] = \sum_{k=0}^{N-1} \hat{f}[n-k]\, \hat{g}[k]$


for all signals f, g defined on $\mathbb{Z}[0, N-1]$ where $\hat{f}$, $\hat{g}$ are periodic
extensions of f and g. Circular convolution has several other important
properties not listed here but explained and derived in a later module.


Alternatively, discrete time circular convolution can be expressed as the
sum of two summations given by


Equation:

$(f \circledast g)[n] = \sum_{k=0}^{n} f[k]\, g[n-k] + \sum_{k=n+1}^{N-1} f[k]\, g[n-k+N]$


for all signals f, g defined on $\mathbb{Z}[0, N-1]$.
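
Both forms of circular convolution are straightforward to program. The sketch below implements the periodic-extension form with modular indexing and the two-summation form literally, then compares them. It assumes both sequences have the same length N, and the function names are hypothetical.

```python
def circular_convolve(f, g):
    """Circular convolution using the periodic extension of g (index mod N)."""
    N = len(f)
    return [sum(f[k] * g[(n - k) % N] for k in range(N)) for n in range(N)]


def circular_convolve_two_sums(f, g):
    """Same operation written as the sum of two summations."""
    N = len(f)
    out = []
    for n in range(N):
        first = sum(f[k] * g[n - k] for k in range(0, n + 1))
        second = sum(f[k] * g[n - k + N] for k in range(n + 1, N))
        out.append(first + second)
    return out


f = [1, 2, 3, 4]
g = [1, 0, 1, 0]

direct = circular_convolve(f, g)              # [4, 6, 4, 6]
two_sums = circular_convolve_two_sums(f, g)   # identical result
```

The second summation handles the wrap around: for k > n the index n - k is negative, and adding N maps it back into the range 0 to N - 1.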


Meaningful examples of computing discrete time circular convolutions in
the time domain would involve complicated algebraic manipulations
dealing with the wrap around behavior, which would ultimately be more
confusing than helpful. Thus, none will be provided in this section. Of
course, example computations in the time domain are easy to program and
demonstrate. However, discrete time circular convolutions are more easily
computed using frequency domain tools, as will be shown in the discrete
time Fourier series section.


Definition Motivation 


The above operation definition has been chosen to be particularly useful in
the study of linear time invariant systems. In order to see this, consider a
linear time invariant system H with unit impulse response h. Given a
periodic system input signal x we would like to compute the system output
signal H(x). First, we note that the input can be expressed as the circular
convolution

Equation:


$x[n] = \sum_{k=0}^{N-1} \hat{x}[k]\, \hat{\delta}[n-k]$


by the sifting property of the unit impulse function. By linearity,
Equation:


$H(x[n]) = \sum_{k=0}^{N-1} \hat{x}[k]\, H(\hat{\delta}[n-k])$.


Since $H(\delta[n-k])$ is the shifted unit impulse response $h[n-k]$, this gives
the result
Equation:


$H(x[n]) = \sum_{k=0}^{N-1} \hat{x}[k]\, \hat{h}[n-k] = (x \circledast h)[n]$.


Hence, circular convolution has been defined such that the output of a linear
time invariant system is given by the circular convolution of the system
input with the system unit impulse response.


Graphical Intuition 


It is often helpful to be able to visualize the computation of a circular
convolution in terms of graphical processes. Consider the circular
convolution of two finite length functions f, g given by

Equation:


$(f \circledast g)[n] = \sum_{k=0}^{N-1} \hat{f}[k]\, \hat{g}[n-k] = \sum_{k=0}^{N-1} \hat{f}[n-k]\, \hat{g}[k]$.


The first step in graphically understanding the operation of convolution is to
plot each of the periodic extensions of the functions. Next, one of the
functions must be selected, and its plot reflected across the k = 0 axis. For
each $n \in \mathbb{Z}[0, N-1]$, that same function must be shifted right by n. The
point-wise product of the two resulting plots is then computed, and finally
all of these values are summed.




Convolution Summary 


Convolution, one of the most important concepts in electrical engineering, 
can be used to determine the output signal of a linear time invariant system 
for a given input signal with knowledge of the system's unit impulse 
response. The operation of discrete time convolution is defined such that it 
performs this function for infinite length discrete time signals and systems. 
The operation of discrete time circular convolution is defined such that it 
performs this function for finite length and periodic discrete time signals. In 
each case, the output of the system is the convolution or circular 
convolution of the input signal with the unit impulse response. 


Convolution - Analog 


In this module we examine convolution for continuous time signals. This 
will result in the convolution integral and its properties. These concepts are 
very important in Electrical Engineering and will make any engineer's life a 
lot easier if the time is spent now to truly understand what is going on. 


In order to fully understand convolution, you may find it useful to look at
the discrete-time convolution as well. Accompanying this module there is
a fully worked out example with mathematics and figures. It will also be
helpful to experiment with the Convolution - Continuous time applet
available from Johns Hopkins University. These resources offer different
approaches to this crucial concept.


Derivation of the convolution integral 


The derivation used here closely follows the one discussed in the 
motivation section above. To begin this, it is necessary to state the 
assumptions we will be making. In this instance, the only constraints on our 
system are that it be linear and time-invariant. 

Brief Overview of Derivation Steps: 


1. An impulse input leads to an impulse response output. 

2. A shifted impulse input leads to a shifted impulse response output. 
This is due to the time-invariance of the system. 

3. We now scale the impulse input to get a scaled impulse output. This is 
using the scalar multiplication property of linearity. 

4. We can now "sum up" an infinite number of these scaled impulses to 
get a sum of an infinite number of scaled impulse responses. This is 
using the additivity attribute of linearity. 

5. Now we recognize that this infinite sum is nothing more than an 
integral, so we convert both sides into integrals. 

6. Recognizing that the input is the function f(t), we also recognize that 
the output is exactly the convolution integral. 


[Figure: We begin with a system defined by its impulse response, h(t).]


[Figure: $\delta(t-\tau) \rightarrow H \rightarrow h(t-\tau)$. We then consider a shifted version of
the input impulse. Due to the time invariance of the system, we obtain a
shifted version of the output impulse response.]


[Figure: $f(\tau)\delta(t-\tau) \rightarrow H \rightarrow f(\tau)h(t-\tau)$. Now we use the scaling part of
linearity by scaling the system by a value, $f(\tau)$, that is constant with
respect to the system variable, t.]


[Figure: $\int_{-\infty}^{\infty} f(\tau)\delta(t-\tau)\,d\tau \rightarrow H \rightarrow \int_{-\infty}^{\infty} f(\tau)h(t-\tau)\,d\tau$. We can now use
the additivity aspect of linearity to add an infinite number of these, one
for each possible $\tau$. Since an infinite sum is exactly an integral, we end up
with the integration known as the Convolution Integral. Using the sampling
property, we recognize the left-hand side simply as the input f(t).]


Convolution Integral 


As mentioned above, the convolution integral provides an easy 
mathematical way to express the output of an LTI system based on an 
arbitrary signal, x(t), and the system's impulse response, h(t). The 
convolution integral is expressed as 

Equation: 

y(t) = \int_{-\infty}^{\infty} x(\tau) h(t - \tau) \, d\tau 

Convolution is such an important tool that it is represented by the symbol *, 
and can be written as 
Equation: 


y(t) = x(t) *h(t) 


By making a simple change of variables in the convolution integral, 
τ' = t − τ, we can easily show that convolution is commutative: 
Equation: 


x(t) * h(t) = h(t) * x(t) 


which gives an equivalent form of [link] 
Equation: 

y(t) = \int_{-\infty}^{\infty} h(\tau) x(t - \tau) \, d\tau 

For more information on the characteristics of the convolution integral, read 
about the Properties of Convolution. 


Implementation of Convolution 


Taking a closer look at the convolution integral, we find that we are 
multiplying the input signal by the time-reversed impulse response and 
integrating. This will give us the value of the output at one given value of t. 
If we then shift the time-reversed impulse response by a small amount, we 
get the output for another value of t. Repeating this for every possible value 
of t yields the total output function. While we would never actually do this 
computation by hand in this fashion, it does provide us with some insight 
into what is actually happening. We find that we are essentially reversing 
the impulse response function and sliding it across the input function, 
integrating as we go. This method, referred to as the graphical method, 
provides us with a much simpler way to solve for the output for simple 
(contrived) signals, while improving our intuition for the more complex 
cases where we rely on computers. In fact, Texas Instruments develops 
Digital Signal Processors which have special instruction sets for 
computations such as convolution. 
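The reverse, slide and integrate procedure described above can be sketched numerically. The following is a minimal illustration, not an optimized implementation: the function names, the integration range and the step count are assumptions chosen for this example, and the integral is approximated by a midpoint Riemann sum.

```python
def convolve_at(x, h, t, lo=-5.0, hi=5.0, steps=20000):
    """Approximate y(t) = integral of x(tau) * h(t - tau) d tau on [lo, hi].

    The signals are assumed to be zero outside [lo, hi] (an assumption of
    this sketch), and the integral is replaced by a midpoint Riemann sum.
    """
    d_tau = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        tau = lo + (k + 0.5) * d_tau
        # h(t - tau) is the impulse response reversed and shifted to time t.
        total += x(tau) * h(t - tau)
    return total * d_tau


def unit_pulse(t):
    """Square pulse of height 1 on the interval 0 <= t < 1."""
    return 1.0 if 0.0 <= t < 1.0 else 0.0


if __name__ == "__main__":
    # One output value per shift t; repeating this for many t traces out y(t).
    for t in (-0.5, 0.5, 1.0, 1.5, 2.5):
        print(t, round(convolve_at(unit_pulse, unit_pulse, t), 3))
```

Each call computes a single output value, which makes the cost of the direct method visible: one full integration per value of t.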


Summary 


Convolution is a truly important concept, which must be well understood. 


Note: y(t) = \int_{-\infty}^{\infty} x(\tau) h(t - \tau) \, d\tau 


Note: y(t) = \int_{-\infty}^{\infty} h(\tau) x(t - \tau) \, d\tau 




Convolution - Complete example 


Basic Example 
Let us look at a basic continuous-time convolution example to help express 
some of the important ideas. We will convolve together two square pulses, 
x(t) and h(t), as shown in [link] 


Two basic signals that we will convolve together. 


Reflect and Shift 


Now we will take one of the functions and reflect it around the y-axis. Then 
we must shift the function, such that the origin, the point of the function 
that was originally on the origin, is labeled as point t. This step is shown in 
[link], h(t − τ). 


Reflected square pulse. Reflected and shifted square pulse. 


h(−τ) and h(t − τ). 


Note that in [link] τ is the 1st axis variable while t is a constant (in this 
figure). Since convolution is commutative it will never matter which 
function is reflected and shifted; however, as the functions become more 
complicated reflecting and shifting the "right one" will often make the 
problem much easier. 


Regions of Integration 


We start out with the convolution integral, 
y(t) = \int_{-\infty}^{\infty} x(\tau) h(t - \tau) \, d\tau. 
The value of the function y at time t is given by the amount of overlap (to 
be precise, the integral of the overlapping region) between h(t − τ) and x(τ). 


Next, we want to look at the functions and divide the span of the functions 
into different limits of integration. These different regions can be 
understood by thinking about how we slide h(t − τ) over x(τ), see [link]. 


Figures to help understand the regions of integration 


h(t − τ) on its way "out of" x(τ) 


No overlap. 


In this case we will have the following four regions. Compare these limits 
of integration to the four illustrations of h(t − τ) and x(τ) in [link]. 
Four Limits of Integration 

1. t < 0 
2. 0 ≤ t < 1 
3. 1 ≤ t < 2 
4. t ≥ 2 


Using the Convolution Integral 


Finally we are ready for a little math. Using the convolution integral, let us 
integrate the product x(τ)h(t − τ). For our first and fourth regions this 
will be trivial, as it will always be 0. The second region, 0 ≤ t < 1, will 
require the following math: 

Equation: 


y(t) = \int_{0}^{t} 1 \, d\tau = t 


The third region, 1 ≤ t < 2, is solved in much the same manner. Take note 
of the changes in our integration though. As we move h(t − τ) across our 
other function, the left-hand edge of the function, t − 1, becomes our 
lower limit for the integral. This is shown through our convolution integral as 
Equation: 


y(t) = \int_{t-1}^{1} 1 \, d\tau = 1 - (t - 1) = 2 - t 


The above formulas show the method for calculating convolution; however, 
do not let the simplicity of this example confuse you when you work on 
other problems. The method will be the same, you will just have to deal 
with more math in more complicated integrals. 


Note that the value of y(t) at all time is given by the integral of the 
overlapping functions. In this example y for a given t equals the gray area 
in the plots in [link]. 


Convolution Results 


Thus, we have the following results for our four regions: 
Equation: 


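y(t) = \begin{cases} 0 & \text{if } t < 0 \\ t & \text{if } 0 \le t < 1 \\ 2 - t & \text{if } 1 \le t < 2 \\ 0 & \text{if } t \ge 2 \end{cases} 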


Now that we have found the resulting function for each of the four regions, 
we can combine them together and graph the convolution of x(t)*h(t). 


Shows the system's 
output in response to 
the input, x(t). 


Common sense approach 


By looking at [link] we can obtain the system output, y(t), by "common" 
sense. For t < 0 there is no overlap, so y(t) is 0. As t goes from 0 to 1 the 
overlap will linearly increase with a maximum for t = 1; the maximum 
corresponds to the peak value in the triangular pulse. As t goes from 1 to 2 
the overlap will linearly decrease. For t > 2 there will be no overlap and 
hence no output. 
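The overlap argument can be written down directly. The sketch below assumes, as in this example, that both pulses have unit height and occupy the interval from 0 to 1; the function names are chosen for illustration only. It measures the length of the overlap between x(τ) and h(t − τ) and compares it with the piecewise result obtained by integration.

```python
def overlap_length(t):
    """Length of the overlap between x(tau) on [0, 1] and h(t - tau) on [t - 1, t]."""
    left = max(0.0, t - 1.0)    # left edge of the overlapping region
    right = min(1.0, t)         # right edge of the overlapping region
    return max(0.0, right - left)


def y_piecewise(t):
    """The four-region result derived with the convolution integral."""
    if t < 0.0:
        return 0.0
    if t < 1.0:
        return t
    if t < 2.0:
        return 2.0 - t
    return 0.0


if __name__ == "__main__":
    for t in (-1.0, 0.25, 1.0, 1.75, 3.0):
        print(t, overlap_length(t), y_piecewise(t))
```

Because both pulses have height 1, the integral of the product equals the overlap length, so the two functions agree for every t.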


We see readily from the "common" sense approach that the output function 
y(t) is the same as obtained above with calculations. When convolving two 
square pulses of equal width the result will always be a triangular pulse 
(pulses of unequal width give a trapezoid). Its origin, peak value and 
stretch will, of course, vary. 
Properties of Continuous Time Convolution 
This module discusses the properties of continuous time convolution. 


Introduction 


We have already shown the important role that continuous time convolution plays in 
signal processing. This section provides discussion and proof of some of the 
important properties of continuous time convolution. Analogous properties can be 
shown for continuous time circular convolution with trivial modification of the proofs 
provided except where explicitly noted otherwise. 


Continuous Time Convolution Properties 


Associativity 


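The operation of convolution is associative. That is, for all continuous time signals 
x_1, x_2, x_3 the following relationship holds. 
Equation: 


x_1 * (x_2 * x_3) = (x_1 * x_2) * x_3 


In order to show this, note that 
Equation: 


(x_1 * (x_2 * x_3))(t) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x_1(\tau_1) x_2(\tau_2) x_3((t - \tau_1) - \tau_2) \, d\tau_2 \, d\tau_1 
= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x_1(\tau_1) x_2((\tau_1 + \tau_2) - \tau_1) x_3(t - (\tau_1 + \tau_2)) \, d\tau_2 \, d\tau_1 
= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x_1(\tau_1) x_2(\tau_3 - \tau_1) x_3(t - \tau_3) \, d\tau_1 \, d\tau_3 
= ((x_1 * x_2) * x_3)(t) 


proving the relationship as desired through the substitution τ_3 = τ_1 + τ_2. 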


Commutativity 


The operation of convolution is commutative. That is, for all continuous time signals 
x_1, x_2 the following relationship holds. 
Equation: 


x_1 * x_2 = x_2 * x_1 


In order to show this, note that 
Equation: 


(x_1 * x_2)(t) = \int_{-\infty}^{\infty} x_1(\tau_1) x_2(t - \tau_1) \, d\tau_1 
= \int_{-\infty}^{\infty} x_1(t - \tau_2) x_2(\tau_2) \, d\tau_2 
= (x_2 * x_1)(t) 


proving the relationship as desired through the substitution τ_2 = t − τ_1. 


Distributivity 


The operation of convolution is distributive over the operation of addition. That is, for 
all continuous time signals x_1, x_2, x_3 the following relationship holds. 
Equation: 


x_1 * (x_2 + x_3) = x_1 * x_2 + x_1 * x_3 


In order to show this, note that 
Equation: 


(x_1 * (x_2 + x_3))(t) = \int_{-\infty}^{\infty} x_1(\tau) (x_2(t - \tau) + x_3(t - \tau)) \, d\tau 
= \int_{-\infty}^{\infty} x_1(\tau) x_2(t - \tau) \, d\tau + \int_{-\infty}^{\infty} x_1(\tau) x_3(t - \tau) \, d\tau 
= (x_1 * x_2 + x_1 * x_3)(t) 


proving the relationship as desired. 


Multilinearity 


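The operation of convolution is linear in each of the two function variables. 
Additivity in each variable results from distributivity of convolution over addition. 
Homogeneity of order one in each variable results from the fact that for all continuous 
time signals x_1, x_2 and scalars a the following relationship holds. 
Equation: 


a (x_1 * x_2) = (a x_1) * x_2 = x_1 * (a x_2) 


In order to show this, note that 
Equation: 


(a (x_1 * x_2))(t) = a \int_{-\infty}^{\infty} x_1(\tau) x_2(t - \tau) \, d\tau 
= \int_{-\infty}^{\infty} (a x_1(\tau)) x_2(t - \tau) \, d\tau = ((a x_1) * x_2)(t) 
= \int_{-\infty}^{\infty} x_1(\tau) (a x_2(t - \tau)) \, d\tau = (x_1 * (a x_2))(t) 


proving the relationship as desired. 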


Conjugation 


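The operation of convolution has the following property for all continuous time 
signals x_1, x_2, where the overline denotes complex conjugation. 
Equation: 


\overline{x_1 * x_2} = \overline{x_1} * \overline{x_2} 


In order to show this, note that 
Equation: 


\overline{(x_1 * x_2)(t)} = \overline{\int_{-\infty}^{\infty} x_1(\tau) x_2(t - \tau) \, d\tau} 
= \int_{-\infty}^{\infty} \overline{x_1(\tau)} \, \overline{x_2(t - \tau)} \, d\tau 
= (\overline{x_1} * \overline{x_2})(t) 


proving the relationship as desired. 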


Time Shift 


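The operation of convolution has the following property for all continuous time 
signals x_1, x_2 where S_T is the time shift operator, (S_T x)(t) = x(t − T). 
Equation: 


S_T (x_1 * x_2) = (S_T x_1) * x_2 = x_1 * (S_T x_2) 


In order to show this, note that 
Equation: 


S_T (x_1 * x_2)(t) = \int_{-\infty}^{\infty} x_2(\tau) x_1((t - T) - \tau) \, d\tau 
= \int_{-\infty}^{\infty} x_2(\tau) (S_T x_1)(t - \tau) \, d\tau 
= ((S_T x_1) * x_2)(t) 
= \int_{-\infty}^{\infty} x_1(\tau) x_2((t - T) - \tau) \, d\tau 
= \int_{-\infty}^{\infty} x_1(\tau) (S_T x_2)(t - \tau) \, d\tau 
= (x_1 * (S_T x_2))(t) 


proving the relationship as desired. 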


Differentiation 
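The operation of convolution has the following property for all continuous time 
signals x_1, x_2. 
Equation: 


\frac{d}{dt} (x_1 * x_2)(t) = \left( \frac{dx_1}{dt} * x_2 \right)(t) = \left( x_1 * \frac{dx_2}{dt} \right)(t) 


In order to show this, note that 
Equation: 


\frac{d}{dt} (x_1 * x_2)(t) = \int_{-\infty}^{\infty} x_2(\tau) \frac{d}{dt} x_1(t - \tau) \, d\tau 
= \left( \frac{dx_1}{dt} * x_2 \right)(t) 
= \int_{-\infty}^{\infty} x_1(\tau) \frac{d}{dt} x_2(t - \tau) \, d\tau 
= \left( x_1 * \frac{dx_2}{dt} \right)(t) 


proving the relationship as desired. 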


Impulse Convolution 


The operation of convolution has the following property for all continuous time 
signals x where δ is the Dirac delta function. 
Equation: 


x * \delta = x 


In order to show this, note that 
Equation: 


(x * \delta)(t) = \int_{-\infty}^{\infty} x(\tau) \delta(t - \tau) \, d\tau 
= x(t) \int_{-\infty}^{\infty} \delta(t - \tau) \, d\tau 
= x(t) 


proving the relationship as desired. 


Width 


The operation of convolution has the following property for all continuous time 
signals x_1, x_2 where Duration(x) gives the duration of a signal x. 
Equation: 


Duration(x_1 * x_2) = Duration(x_1) + Duration(x_2) 


In order to show this informally, note that (x_1 * x_2)(t) can be nonzero only for 
those t for which there is a τ such that x_1(τ) x_2(t − τ) is nonzero. When viewing one 
function as reversed and sliding past the other, it is easy to see that such a τ exists 
for all t on an interval of length Duration(x_1) + Duration(x_2). Note that this is not 
always true of circular convolution of finite length and periodic signals as there is 
then a maximum possible duration within a period. 


Convolution Properties Summary 


As can be seen the operation of continuous time convolution has several important 
properties that have been listed and proven in this module. With slight modifications 
to proofs, most of these also extend to continuous time circular convolution as well 
and the cases in which exceptions occur have been noted above. These identities will 
be useful to keep in mind as the reader continues to study signals and systems. 
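As a quick sanity check, several of these identities can be tried out numerically. The sketch below is only an illustration and makes one clear substitution: it uses discrete convolution of short finite sequences as a stand-in for the continuous time integral, so the duration property appears as the length rule N + M − 1 rather than as a sum of durations. All names and the sample sequences are assumptions made for this example.

```python
def conv(a, b):
    """Discrete convolution of two finite sequences given as lists of numbers."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, a_i in enumerate(a):
        for j, b_j in enumerate(b):
            out[i + j] += a_i * b_j
    return out


def add(a, b):
    """Element-wise sum of two sequences of equal length."""
    return [p + q for p, q in zip(a, b)]


def close(u, v, tol=1e-9):
    """True when two sequences have equal length and nearly equal entries."""
    return len(u) == len(v) and all(abs(p - q) <= tol for p, q in zip(u, v))


x1 = [1.0, 2.0, -1.0]
x2 = [0.5, 0.0, 3.0, 1.0]
x3 = [2.0, -1.0, 0.5, 4.0]

commutative = close(conv(x1, x2), conv(x2, x1))
associative = close(conv(x1, conv(x2, x3)), conv(conv(x1, x2), x3))
distributive = close(conv(x1, add(x2, x3)), add(conv(x1, x2), conv(x1, x3)))
length_rule = len(conv(x1, x2)) == len(x1) + len(x2) - 1

if __name__ == "__main__":
    print(commutative, associative, distributive, length_rule)
```

A numerical check of this kind does not replace the proofs above; it is merely a convenient way to catch a misremembered identity.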


Frequency response from a circuit diagram 


In this module we calculate the frequency response from a circuit diagram 
of a simple analog filter, as shown in [link]. We know that the frequency 
response, denoted by H(iΩ), is calculated as the ratio of the output and input 
voltages (in the frequency domain). That is, 

Equation: 


\frac{V_{out}}{V_{in}} = H(i\Omega) 


Notice that we use capital letters in these relations. This is to indicate that 
they are frequency domain descriptions. 


Now, to calculate the frequency response we find expressions for V_in and 
V_out, as follows 
Equation: 


V_{in} = I R + V_{out} 


Further, the current in the circuit can be expressed as 
Equation: 


I = i \Omega C V_{out} 


Then, the frequency response is given as: 
Equation: 


H(i\Omega) = \frac{V_{out}}{V_{in}} = \frac{1}{i \Omega R C + 1} 

Note that above we have used impedance considerations. Have a look at 
The Impedance concept and Impedance for a quick summary of impedance 
considerations. 


Implicit in using the transfer function is that the input is a complex 
exponential, and the output is also a complex exponential having the same 
frequency. The transfer function reveals how the circuit modifies the input 
amplitude in creating the output amplitude. Thus, the transfer function 
completely describes how the circuit processes the input complex 
exponential to produce the output complex exponential. The circuit's 
function is thus summarized by the transfer function. In fact, circuits are 
often designed to meet transfer function specifications. Because transfer 
functions are complex-valued, frequency-dependent quantities, we can 
better appreciate a circuit's function by examining the magnitude and phase 
of its transfer function ([link]). Note that in [link] we plot the magnitude 
and phase as a function of the frequency F, instead of the angular frequency Ω. 
Since Ω = 2πF, this is just a matter of taste, see Frequency definitions and 
periodicity for details. 

Simple Circuit 


A simple 
RC circuit. 


Magnitude and phase of the transfer function 


Magnitude and phase of the transfer function of the RC circuit shown 
in [link] when RC = 1. 




Several things to note about this transfer function. 


We can compute the frequency response for both positive and negative 
frequencies. Recall that sinusoids consist of the sum of two complex 
exponentials, one having the negative frequency of the other. We will 
consider how the circuit acts on a sinusoid soon. Do note that the magnitude 
has even symmetry: The negative frequency portion is a mirror image of 
the positive frequency portion: |H(−iF)| = |H(iF)|. The phase has 
odd symmetry: ∠H(−iF) = −∠H(iF). These properties of this 
specific example apply for all transfer functions associated with circuits. 
Consequently, we don't need to plot the negative frequency component; we 
know what it is from the positive frequency part. 


The magnitude equals 1/√2 of its maximum gain (1 at F = 0) when 
2πFRC = 1 (the two terms in the denominator of the magnitude are 
equal). The frequency F_c = 1/(2πRC) defines the boundary between two 
operating ranges. 


• For frequencies below this frequency, the circuit does not much alter 
the amplitude of the complex exponential source. 

• For frequencies greater than F_c, the circuit strongly attenuates the 
amplitude. Thus, when the source frequency is in this range, the 
circuit's output has a much smaller amplitude than that of the source. 


For these reasons, this frequency is known as the cutoff frequency. In this 
circuit the cutoff frequency depends only on the product of the resistance 
and the capacitance. Thus, a cutoff frequency of 1 kHz occurs when 
1/(2πRC) = 10^3 or RC = 10^{-3}/(2π) = 1.59 × 10^{-4}. Thus resistance-capacitance 
combinations of 1.59 kΩ and 100 nF or 10 Ω and 15.9 μF result in the same 
cutoff frequency. 
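The cutoff arithmetic is easy to check with a few lines of code. The sketch below evaluates the frequency response 1/(iΩRC + 1) derived earlier for the 1.59 kΩ and 100 nF combination; the function names are assumptions made for this illustration.

```python
import cmath
import math


def frequency_response(F, R, C):
    """H(i*Omega) = 1 / (i*Omega*R*C + 1) of the RC circuit, with Omega = 2*pi*F."""
    return 1.0 / (1j * 2.0 * math.pi * F * R * C + 1.0)


def cutoff_frequency(R, C):
    """Frequency in Hz at which 2*pi*F*R*C = 1."""
    return 1.0 / (2.0 * math.pi * R * C)


if __name__ == "__main__":
    R, C = 1.59e3, 100e-9                  # ohms and farads
    F_c = cutoff_frequency(R, C)
    H_c = frequency_response(F_c, R, C)
    print("cutoff frequency (Hz):", F_c)
    print("magnitude at cutoff:", abs(H_c))
    print("phase at cutoff (rad):", cmath.phase(H_c))
```

At the cutoff frequency the real and imaginary parts of the denominator are equal, which is why the magnitude comes out as 1/√2 and the phase as −π/4.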


The phase shift caused by the circuit at the cutoff frequency precisely 
equals −π/4. Thus, below the cutoff frequency, phase is little affected, but at 
higher frequencies, the phase shift caused by the circuit becomes −π/2. This 
phase shift corresponds to the difference between a cosine and a sine. 

We can use the transfer function to find the output when the input voltage is 
a sinusoid for two reasons. First of all, a sinusoid is the sum of two complex 
exponentials, each having a frequency equal to the negative of the other. 
Secondly, because the circuit is linear, superposition applies. If the source is 
a sine wave, we know that 

Equation: 


v_{in}(t) = A \sin(\Omega t) = \frac{A}{2i} \left( e^{i \Omega t} - e^{-i \Omega t} \right) 


Since the input is the sum of two complex exponentials, we know that the 
output is also a sum of two similar complex exponentials, the only 
difference being that the complex amplitude of each is multiplied by the 
transfer function evaluated at each exponential's frequency. 

Equation: 


v_{out}(t) = \frac{A}{2i} H(i\Omega) e^{i \Omega t} - \frac{A}{2i} H(-i\Omega) e^{-i \Omega t} 


As noted earlier, the transfer function is most conveniently expressed in 
polar form: H(iΩ) = |H(iΩ)| e^{i∠H(iΩ)}. Furthermore, 
|H(−iΩ)| = |H(iΩ)| (even symmetry of the magnitude) and 
∠H(−iΩ) = −∠H(iΩ) (odd symmetry of the phase). The output 
voltage expression simplifies to 

Equation: 


v_{out}(t) = \frac{A}{2i} |H(i\Omega)| e^{i(\Omega t + \angle H(i\Omega))} - \frac{A}{2i} |H(i\Omega)| e^{-i(\Omega t + \angle H(i\Omega))} 
= A |H(i\Omega)| \sin(\Omega t + \angle H(i\Omega)) 


The circuit's output to a sinusoidal input is also a sinusoid, having a 
gain equal to the magnitude of the circuit's transfer function evaluated 
at the source frequency and a phase equal to the phase of the transfer 
function at the source frequency. It will turn out that this input-output 
relation description applies to any linear circuit having a sinusoidal source. 
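The two routes to the output can be compared numerically. The sketch below assumes the RC frequency response 1/(iΩRC + 1) from this module with RC = 1; the function names and the test values are chosen for illustration. It evaluates the output once as a sum of two complex exponentials, each scaled by the transfer function at its own frequency, and once as a single sinusoid with gain |H(iΩ)| and phase ∠H(iΩ).

```python
import cmath
import math


def H(omega, RC=1.0):
    """Frequency response of the RC circuit at angular frequency omega."""
    return 1.0 / (1j * omega * RC + 1.0)


def v_out_exponentials(t, A, omega, RC=1.0):
    """Superposition: each complex exponential is scaled by H at its own frequency."""
    positive = (A / 2j) * H(omega, RC) * cmath.exp(1j * omega * t)
    negative = (A / 2j) * H(-omega, RC) * cmath.exp(-1j * omega * t)
    return (positive - negative).real   # the imaginary parts cancel


def v_out_sinusoid(t, A, omega, RC=1.0):
    """Closed form: gain |H| and phase shift equal to the angle of H."""
    h = H(omega, RC)
    return A * abs(h) * math.sin(omega * t + cmath.phase(h))


if __name__ == "__main__":
    for t in (0.0, 0.3, 1.7):
        print(t, v_out_exponentials(t, 2.0, 3.0), v_out_sinusoid(t, 2.0, 3.0))
```

Both columns agree up to rounding, which reflects the even symmetry of the magnitude and the odd symmetry of the phase.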


The notion of impedance arises when we assume the sources are complex 
exponentials. This assumption may seem restrictive; what would we do if 
the source were a unit step? When we use impedances to find the transfer 
function between the source and the output variable, we can derive from it 
the differential equation that relates input and output. The differential 
equation applies no matter what the source may be. As we have argued, it is 
far simpler to use impedances to find the differential equation (because we 
can use series and parallel combination rules) than any other method. In this 


sense, we have not lost anything by temporarily pretending the source is a 
complex exponential. 


In fact we can also solve the differential equation using impedances! Thus, 
despite the apparent restrictiveness of impedances, assuming complex 
exponential sources is actually quite general. 
Why sample? 


This section introduces sampling. Sampling is the necessary foundation for 
all digital signal processing and communication. Sampling can be defined 
as the process of measuring an analog signal at distinct points. 


Digital representation of analog signals offers advantages in terms of 


• robustness towards noise, meaning we can send more bits/s 
• use of flexible processing equipment, in particular the computer 
• more reliable processing equipment 
• easier to adapt complex algorithms 


Claude E. Shannon 


Claude Elwood Shannon (1916-2001) 


Claude Shannon has been called the father of information theory, mainly 
due to his landmark papers on the "Mathematical theory of communication". 
Harry Nyquist was the first to state the sampling theorem 
in 1928, but it was not proven until Shannon proved it 21 years later in the 
paper "Communication in the presence of noise". 


Notation 
In this chapter we will be using the following notation 


• Original analog signal x(t) 
• Sampling frequency F_s 
• Sampling interval T_s (Note that F_s = 1/T_s) 
• Sampled signal x_s(n) (Note that x_s(n) = x(nT_s)) 
• Real angular frequency Ω 
• Digital angular frequency ω (Note that ω = ΩT_s) 


The Sampling Theorem 


Note: When sampling an analog signal the sampling frequency must be 
greater than twice the highest frequency component of the analog signal to 
be able to reconstruct the original signal from the sampled version. 


Proof 


Note: In order to recover the signal x(t) from its samples exactly, it is 
necessary to sample x(t) at a rate greater than twice its highest frequency 
component. 


Introduction 


As mentioned earlier, sampling is the necessary foundation when we want 
to apply digital signal processing to analog signals. 


Here we present the proof of the sampling theorem. The proof is divided in 
two. First we find an expression for the spectrum of the signal resulting 
from sampling the original signal x(t). Next we show that the signal x(t) 
can be recovered from the samples. Often it is easier using the frequency 
domain when carrying out a proof, and this is also the case here. 

Key points in the proof 


• We find an equation for the spectrum of the sampled signal 
• We find a simple method to reconstruct the original signal 
• The sampled signal has a periodic spectrum... 
• ...and the period is 2πF_s 


Proof part 1 - Spectral considerations 


By sampling x(t) every T_s seconds we obtain x_s(n). The inverse Fourier 
transform of this time discrete signal is 
Equation: 


x_s(n) = \frac{1}{2\pi} \int_{-\pi}^{\pi} X_s(e^{i\omega}) e^{i \omega n} \, d\omega 


For convenience we express the equation in terms of the real angular 
frequency Ω using ω = ΩT_s. We then obtain 
Equation: 


x_s(n) = \frac{T_s}{2\pi} \int_{-\pi/T_s}^{\pi/T_s} X_s(e^{i \Omega T_s}) e^{i \Omega T_s n} \, d\Omega 


The inverse Fourier transform of a continuous signal is 
Equation: 


x(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} X(i\Omega) e^{i \Omega t} \, d\Omega 


From this equation we find an expression for x(nT_s) 
Equation: 


x(nT_s) = \frac{1}{2\pi} \int_{-\infty}^{\infty} X(i\Omega) e^{i \Omega n T_s} \, d\Omega 


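To account for the difference in region of integration we split the integration 
in [link] into subintervals of length 2π/T_s and then take the sum over the 
resulting integrals to obtain the complete area. 

Equation: 


x(nT_s) = \frac{1}{2\pi} \sum_{k=-\infty}^{\infty} \int_{(2k-1)\pi/T_s}^{(2k+1)\pi/T_s} X(i\Omega) e^{i \Omega n T_s} \, d\Omega 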


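Then we change the integration variable, setting Ω = η + 2πk/T_s 


Equation: 


x(nT_s) = \frac{1}{2\pi} \sum_{k=-\infty}^{\infty} \int_{-\pi/T_s}^{\pi/T_s} X\left(i\left(\eta + \frac{2\pi k}{T_s}\right)\right) e^{i \eta n T_s} e^{i 2\pi k n} \, d\eta 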


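We obtain the final form by observing that e^{i2πkn} = 1, reinserting η = Ω 
and multiplying by T_s/T_s = 1 


Equation: 

x(nT_s) = \frac{T_s}{2\pi} \int_{-\pi/T_s}^{\pi/T_s} \frac{1}{T_s} \sum_{k=-\infty}^{\infty} X\left(i\left(\Omega + \frac{2\pi k}{T_s}\right)\right) e^{i \Omega n T_s} \, d\Omega 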


To make x_s(n) = x(nT_s) for all values of n, the integrands in [link] and 
[link] have to agree, that is 


Equation: 

X_s(e^{i \Omega T_s}) = \frac{1}{T_s} \sum_{k=-\infty}^{\infty} X\left(i\left(\Omega + \frac{2\pi k}{T_s}\right)\right) 


This is a central result. We see that the digital spectrum consists of a sum of 
shifted versions of the original, analog spectrum. Observe the periodicity! 


We can also express this relation in terms of the digital angular frequency 
ω = ΩT_s 
Equation: 


X_s(e^{i\omega}) = \frac{1}{T_s} \sum_{k=-\infty}^{\infty} X\left(i \, \frac{\omega + 2\pi k}{T_s}\right) 


This concludes the first part of the proof. Now we want to find a 
reconstruction formula, so that we can recover x(t) from x,(n). 
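The central result of part 1 can be checked numerically for a signal whose spectrum is known in closed form. The sketch below is an illustration under stated assumptions: it uses the Gaussian x(t) = exp(−t²/2), whose Fourier transform is √(2π) exp(−Ω²/2), and truncates both infinite sums where the terms are negligible. The Gaussian is not bandlimited, which is fine here because the spectral relation holds whether or not the shifted copies overlap. All function names are chosen for this example.

```python
import math


def x(t):
    """Analog test signal, a Gaussian."""
    return math.exp(-t * t / 2.0)


def X(omega):
    """Fourier transform of the Gaussian test signal."""
    return math.sqrt(2.0 * math.pi) * math.exp(-omega * omega / 2.0)


def spectrum_from_samples(omega, Ts, n_max=60):
    """X_s(e^{i*Omega*Ts}) = sum over n of x(n*Ts) * e^{-i*Omega*Ts*n}.

    The sum is real because the test signal is even, so only the cosine survives.
    """
    return sum(x(n * Ts) * math.cos(omega * Ts * n) for n in range(-n_max, n_max + 1))


def sum_of_shifted_spectra(omega, Ts, k_max=60):
    """(1/Ts) * sum over k of X(i*(Omega + 2*pi*k/Ts))."""
    shifts = (X(omega + 2.0 * math.pi * k / Ts) for k in range(-k_max, k_max + 1))
    return sum(shifts) / Ts


if __name__ == "__main__":
    Ts = 1.0
    for omega in (0.0, 0.7, 2.5):
        print(omega, spectrum_from_samples(omega, Ts), sum_of_shifted_spectra(omega, Ts))
```

Evaluating spectrum_from_samples at Ω and at Ω + 2π/T_s gives the same value, which is the periodicity noted above.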


Proof part II - Signal reconstruction 


For a bandlimited signal the inverse Fourier transform is 
Equation: 


x(t) = \frac{1}{2\pi} \int_{-\pi/T_s}^{\pi/T_s} X(i\Omega) e^{i \Omega t} \, d\Omega 


In the interval we are integrating we have: X_s(e^{iΩT_s}) = X(iΩ)/T_s. 
Substituting this relation into [link] we get 


Equation: 


x(t) = \frac{T_s}{2\pi} \int_{-\pi/T_s}^{\pi/T_s} X_s(e^{i \Omega T_s}) e^{i \Omega t} \, d\Omega 


Using the DTFT relation for X_s(e^{iΩT_s}) we have 
Equation: 


x(t) = \frac{T_s}{2\pi} \int_{-\pi/T_s}^{\pi/T_s} \sum_{n=-\infty}^{\infty} x_s(n) e^{-i \Omega n T_s} e^{i \Omega t} \, d\Omega 


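Interchanging integration and summation (under the assumption of 
convergence) leads to 
Equation: 

x(t) = \frac{T_s}{2\pi} \sum_{n=-\infty}^{\infty} x_s(n) \int_{-\pi/T_s}^{\pi/T_s} e^{i \Omega (t - n T_s)} \, d\Omega 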

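Finally we perform the integration and arrive at the important 
reconstruction formula 
Equation: 

x(t) = \sum_{n=-\infty}^{\infty} x_s(n) \frac{\sin\left(\frac{\pi}{T_s}(t - n T_s)\right)}{\frac{\pi}{T_s}(t - n T_s)} 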

(Thanks to R.Loos for pointing out an error in the proof.) 
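The reconstruction idea, a weighted sum of shifted sin(u)/u pulses with one pulse per sample, can be tried out on a signal that is known to be bandlimited. The following is a minimal sketch under stated assumptions: the test signal is a squared sinc whose spectrum is a triangle limited to 0.4 Hz, the sampling interval is 1 s, and the infinite sum is truncated to a finite number of samples. The names W, sinc, x and reconstruct are hypothetical and chosen for this example.

```python
import math

W = 0.4  # Hz, bandwidth of the test signal (an assumption of this sketch)


def sinc(u):
    """Normalized sinc, sin(pi*u) / (pi*u), with the value 1 at u = 0."""
    return 1.0 if u == 0.0 else math.sin(math.pi * u) / (math.pi * u)


def x(t):
    """Bandlimited test signal; its spectrum is a triangle on |F| <= W."""
    return sinc(W * t) ** 2


def reconstruct(t, Ts, n_max=400):
    """Truncated reconstruction sum: samples x(n*Ts) weight shifted sinc pulses."""
    return sum(x(n * Ts) * sinc((t - n * Ts) / Ts) for n in range(-n_max, n_max + 1))


if __name__ == "__main__":
    Ts = 1.0  # sampling frequency 1 Hz, which exceeds 2 * W = 0.8 Hz
    for t in (0.0, 0.3, 2.75):
        print(t, x(t), reconstruct(t, Ts))
```

Between the sampling instants the truncated sum agrees with the original signal to within the truncation error, which shrinks as more samples are included.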


Summary 


Illustrations 


In this module we illustrate the processes involved in sampling and 
reconstruction. To see how all these processes work together as a whole, 
take a look at the system view. In Sampling and reconstruction with Matlab 
we provide a Matlab script for download. The matlab script shows the 
process of sampling and reconstruction live. 


Basic examples 


Example: 
To sample an analog signal with 3000 Hz as the highest frequency 
component requires a sampling frequency above 6000 Hz. 


Example: 

The sampling theorem can also be applied in two dimensions, i.e. for 
image analysis. A 2D sampling theorem has a simple physical 
interpretation in image analysis: Choose the sampling interval such that it 
is less than or equal to half of the smallest interesting detail in the image. 


The process of sampling 


We start off with an analog signal. This can for example be the sound 
coming from your stereo at home or your friend talking. 


The signal is then sampled uniformly. Uniform sampling implies that we 
sample every T_s seconds. In [link] we see an analog signal. The analog 
signal has been sampled at times t = nT_s. 


Analog signal, samples are marked with dots. 


In signal processing it is often more convenient and easier to work in the 
frequency domain. So let's look at the signal in the frequency domain, [link]. 
For illustration purposes we take the frequency content of the signal as a 
triangle. (If you Fourier transform the signal in [link] you will not get such 
a nice triangle.) 


The spectrum X(iΩ). 


Notice that the signal in [link] is bandlimited. We can see that the signal is 
bandlimited because X(iΩ) is zero outside the interval [−Ω_g, Ω_g]. 
Equivalently we can state that the signal has no angular frequencies above 
Ω_g, corresponding to no frequencies above F_g = Ω_g/(2π). 

Now let's take a look at the sampled signal in the frequency domain. While 
proving the sampling theorem we found that the spectrum of the sampled 
signal consists of a sum of shifted versions of the analog spectrum. 
Mathematically this is described by the following equation: 


Equation: 

X_s(e^{i \Omega T_s}) = \frac{1}{T_s} \sum_{k=-\infty}^{\infty} X\left(i\left(\Omega + \frac{2\pi k}{T_s}\right)\right) 


Sampling fast enough 


In [link] we show the result of sampling x(t) according to the sampling 
theorem. This means that when sampling the signal in [link]/[link] we use 
F_s > 2F_g. Observe in [link] that we have the same spectrum as in [link] for 
Ω ∈ [−Ω_g, Ω_g], except for the scaling factor 1/T_s. This is a consequence of 
the sampling frequency. As mentioned in the proof the spectrum of the 
sampled signal is periodic with period 2πF_s = 2π/T_s. 


The spectrum X_s. Sampling frequency is OK. 


So now we are, according to the sampling theorem, able to reconstruct the 
original signal exactly. How we can do this will be explored further down 
under reconstruction. But first we will take a look at what happens when we 
sample too slowly. 


Sampling too slowly 


If we sample x(t) too slowly, that is F_s < 2F_g, we will get overlap 
between the repeated spectra, see [link]. According to [link] the resulting 
spectrum is the sum of these. This overlap gives rise to the concept of 
aliasing. 


Note: If the sampling frequency is less than twice the highest frequency 
component, then frequencies in the original signal that are above half the 
sampling rate will be "aliased" and will appear in the resulting signal as 
lower frequencies. 


The consequence of aliasing is that we cannot recover the original signal, so 
aliasing has to be avoided. Sampling too slowly will produce a sequence 
x_s(n) that could have originated from a number of signals. So there is no 
chance of recovering the original signal. To learn more about aliasing, take 
a look at this module. (Includes an applet for demonstration!) 


The spectrum X_s. Sampling frequency is too low. 


To avoid aliasing we have to sample fast enough. But if we can't sample fast 
enough (possibly due to costs) we can include an Anti-Aliasing filter. This 
will not enable us to get an exact reconstruction but can still be a good 
solution. 


Note: An anti-aliasing filter is typically a low-pass filter that is applied 
before sampling to ensure that no components with frequencies greater than 
half the sampling frequency remain. 


Example: 

The stagecoach effect 

In older western movies you can observe aliasing on a stagecoach when it 
starts to roll. At first the spokes appear to turn forward, but as the 
stagecoach increases its speed the spokes appear to turn backward. This 
comes from the fact that the sampling rate, here the number of frames per 
second, is too low. We can view each frame as a sample of an image that is 
changing continuously in time. (Applet illustrating the stagecoach effect) 


Reconstruction 


Given the signal in [link] we want to recover the original signal, but the 
question is how? 


When there is no overlapping in the spectrum, the spectral component given 
by k = 0 (see [link]) is equal to the spectrum of the analog signal, apart 
from the scaling factor 1/T_s. This offers an opportunity to use a simple 
reconstruction process. Remember what you have learned about filtering. 
What we want is to change the signal in [link] into that of [link]. To 
achieve this we have to remove all the extra components generated in the 
sampling process. To remove the extra components we apply an ideal analog 
low-pass filter as shown in [link]. As we see, the ideal filter is 
rectangular in the frequency domain. A rectangle in the frequency domain 
corresponds to a sinc function in the time domain (and vice versa). 


H(iΩ) The ideal reconstruction filter. 


Then we have reconstructed the original spectrum, and as we know if two 
signals are identical in the frequency domain, they are also identical in 
the time domain. End of reconstruction. 


Conclusions 


The Shannon sampling theorem requires that the input signal prior to 
sampling is band-limited to below half the sampling frequency. Under this 
condition the samples give an exact signal representation. It is truly 
remarkable that such a broad and useful class of signals can be represented 
that easily! 


We also looked into the problem of reconstructing the signal from its 
samples. Again the simplicity of the principle is striking: linear filtering by 
an ideal low-pass filter will do the job. However, the ideal filter is 
impossible to create, but that is another story... 


Sampling and reconstruction with Matlab 


Matlab files 
Samprecon.m 


Exercises 


Aliasing Applet 


The applet is courtesy of the Digital Signal Processing tutorial at 
freeuk.com, http://www.dsptutor.freeuk.com/. You can also have a look at 
the Light Wheel applet. 


Introduction 


In this module we shall look at sampling a sinusoidal signal. According to 
the sampling theorem, a sinusoidal signal can be exactly reconstructed from 
values sampled at discrete, uniform intervals as long as the signal frequency 
is less than half the sampling frequency. Any component of a sampled 
signal with a frequency above this limit, often referred to as the folding 
frequency, is subject to aliasing. 


The applet is based on a fixed sampling rate of 
F_s = 8000 samples per second (one sample every 0.125 milliseconds, i.e. 
T_s = 1/8000 s). 


Instructions 


Set the frequency of the sinusoidal signal, in Hz, in the "Input frequency" 
box, i.e. choose an f in the following signal: sin(2πft). When you click the 
"Plot" button, with "Input signal" checked, the input signal is plotted 
against time. 


The "Grid" checkbox toggles on and off vertical gridlines indicating the 
instants at which the signal is sampled. The "Sample points", representing 
the sampled values of the input signal, can also be toggled. 


Finally, the "Alias frequency" checkbox (visible only when aliasing occurs) 
controls the plotting of the "reconstructed" sinusoidal signal, with f = f_alias. 
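The alias frequency can be computed by folding the input frequency into the band from −F_s/2 to F_s/2. The sketch below assumes the applet's sampling rate of 8000 samples per second; the helper names are hypothetical and say nothing about how the applet itself is implemented. It also confirms that the input sinusoid and its alias produce the same sample values.

```python
import math

FS = 8000.0  # samples per second, the fixed rate stated for the applet


def alias_frequency(f, fs=FS):
    """Signed frequency in [-fs/2, fs/2] whose samples match those of f."""
    return f - fs * round(f / fs)


def sample_sine(f, n, fs=FS):
    """n-th sample of sin(2*pi*f*t) taken at t = n / fs."""
    return math.sin(2.0 * math.pi * f * n / fs)


if __name__ == "__main__":
    f = 5000.0                      # above the folding frequency of 4000 Hz
    f_alias = alias_frequency(f)    # a negative value means an inverted sine
    print("alias frequency (Hz):", f_alias)
    for n in range(5):
        print(n, sample_sine(f, n), sample_sine(f_alias, n))
```

A negative result means that the reconstructed sinusoid has the frequency |f_alias| and an inverted sign, since sin(−θ) = −sin(θ).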


Overview of the process 


When using the applet it is important to have an understanding of where the 
different signals occur in a sampling system. 


Ideal sampling process (signals shown: x(t), x_s(n), x_s(t) and \hat{x}(t)) 


Relating the applet signals to the figure we get 


• Input signal = x(t) = sin(2πft), where f is the input frequency 
chosen by the user. 

• The sampled signal = x_s(n) = sin(2πfnT_s) = sin(2πfn × 1/8000) 

• The reconstructed signal = \hat{x}(t), is shown as the original signal if 
sampling is done fast enough, or as the aliased signal if sampling is too 
slow. 


(h(t) is an ideal reconstruction filter). 


Aliasing demo applet 




Hold operation 


Any practical reconstruction system must input finite length pulses into the 
reconstruction filter. The reason is that we need nonzero energy in the 
nonzero pulses. 


Introduction 


The operation performed to produce these pulses is called hold. Using the 
hold-operation we get pulses with a predefined length and height 
proportional to the input to the digital-to-analog converter. By means of the 
hold operation we get nonzero pulses with energy. 


Output signal from the hold device 


As we have made changes relative to the ideal reconstruction, we need to 
look at the output signal the reconstruction filter will give us. Quite 
obviously the output will not be the original signal. So, is it still useful? 


Analysis 


As before, and as will be the situation later, using the frequency domain 
simplifies the analysis. To model the hold operation we use convolution 
with a delta function and a square pulse. The square pulse has unit height 
and duration τ. The duration τ is the holding time, i.e. how long we hold 
the incoming value. For the pulses not to overlap we must choose τ ≤ T_s. 


The convolution can be seen as a filtering operation, using the square pulse 
as the impulse response. If we Fourier transform the square pulse we obtain 
the frequency response of the filter, which is a sinc function. 
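
The attenuation caused by the hold pulse can be checked numerically. The following is a minimal sketch, assuming the frequency response is normalised to 0 dB at f = 0 and that f is the digital frequency (f = 0.5 is the band edge); the function name hold_gain_db and the parameter tau_over_T are illustrative names, not part of the course material.

```python
import math

def hold_gain_db(f, tau_over_T=1.0):
    """Gain in dB of a square hold pulse at digital frequency f.

    The pulse lasts tau = tau_over_T * T, and the response is the sinc
    function sin(pi*f*tau/T) / (pi*f*tau/T), normalised to 0 dB at f = 0.
    Valid away from the zeros of the sinc (e.g. |f| <= 0.5, tau <= T).
    """
    x = f * tau_over_T
    if x == 0.0:
        return 0.0
    magnitude = abs(math.sin(math.pi * x) / (math.pi * x))
    return 20.0 * math.log10(magnitude)

# Attenuation at the band edge for a few (assumed) holding times.
for ratio in (1.0, 0.5, 0.25):
    print(f"tau/T = {ratio}: {hold_gain_db(0.5, ratio):.2f} dB at f = 0.5")
```

Shorter holding times give a flatter response in the passband, at the price of less energy in each pulse.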


[link] shows the frequency response of the analog square pulse filter. We 
have plotted the frequency response for τ = T and for a shorter holding time. 


[Figure: Frequency response of the analog square filter as a function of digital frequency f. Vertical axis: hold amplification (dB).]


From the figure we can make the following observations 


• The signal will be attenuated more and more towards the band edge, 
f = 0.5. 

• For τ = T, the maximum attenuation is about 3.9 dB at f = 0.5. 

• For the shorter holding time, the maximum attenuation is 0.82 dB at f = 0.5. 


The distortion is a result of linear operations and can thus be compensated 
for by using a filter with the inverse frequency response in the passband, 
f ∈ [−0.5, 0.5]. The compensation will not be exact, but we can make the 
approximation as accurate as we wish. The compensation can be made in 
the reconstruction filter or after the reconstruction by using a separate 
analog filter. One can also predistort the signal in a digital filter before 
reconstruction. Where to put the compensator and its quality are cost 
considerations. 


Exercises 


Systems view of sampling and reconstruction 


Ideal reconstruction system 


[link] shows the ideal reconstruction system based on the results of the 
Sampling theorem proof. 


[link] consists of a sampling device which produces a time-discrete 
sequence x_s(n). The reconstruction filter, h(t), is an ideal analog sinc 
filter, with h(t) = sinc(t/T_s). We can't apply the time-discrete sequence 
x_s(n) directly to the analog filter h(t). To solve this problem we turn the 
sequence into an analog signal using delta functions. Thus we write 


x_s(t) = \sum_{n=-\infty}^{\infty} x_s(n) \delta(t - nT_s)


[Figure: Ideal reconstruction system. Signals shown, in order: x(t), x_s(n), x_s(t), x̂(t).]


But when will the system produce an output x̂(t) = x(t)? According to the 
sampling theorem we have x̂(t) = x(t) when the sampling frequency, F_s, 
is more than twice the highest frequency component of x(t). 
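
The ideal reconstruction can be illustrated with a truncated version of the sinc sum. This is only a sketch under stated assumptions: the infinite sum is cut off at ±N samples (so the result is approximate), and the test signal, F_s = 8000 Hz, N and the helper names sinc and reconstruct are chosen here for illustration.

```python
import math

def sinc(u):
    """Normalised sinc: sin(pi*u) / (pi*u), with sinc(0) = 1."""
    return 1.0 if u == 0.0 else math.sin(math.pi * u) / (math.pi * u)

def reconstruct(samples, n_start, Ts, t):
    """Truncated sinc interpolation: sum over n of x_s(n) * sinc((t - n*Ts) / Ts)."""
    return sum(x * sinc((t - (n_start + k) * Ts) / Ts)
               for k, x in enumerate(samples))

Fs = 8000.0          # sampling frequency in Hz (assumed)
Ts = 1.0 / Fs
f = 1000.0           # input frequency, well below Fs / 2
N = 500              # keep samples n = -N ... N (truncation is an approximation)

samples = [math.sin(2.0 * math.pi * f * n * Ts) for n in range(-N, N + 1)]

t = 0.3 * Ts         # a time instant between two sampling points
approx = reconstruct(samples, -N, Ts, t)
exact = math.sin(2.0 * math.pi * f * t)
print(f"reconstructed: {approx:.4f}, original: {exact:.4f}")
```

The residual error comes from truncating the sum; it shrinks as more samples are included.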


Ideal system including anti-aliasing 


To be sure that the reconstructed signal is free of aliasing it is customary to 
apply a lowpass filter, an anti-aliasing filter, before sampling as shown in 
[link]. 


[Figure: Ideal reconstruction system with anti-aliasing filter. Input s(t), output x̂(t).]


Again we ask the question of when the system will produce an output 
x̂(t) = s(t)? If the signal is entirely confined within the passband of the 
lowpass filter we will get perfect reconstruction if F_s is high enough. 


But if the anti-aliasing filter removes the "higher" frequencies (which in 
fact is the job of the anti-aliasing filter), we will never be able to exactly 
reconstruct the original signal, s(t). If we sample fast enough we can 
reconstruct x(t), which in most cases is satisfactory. 


The reconstructed signal, x̂(t), will not have aliased frequencies. This is 
essential for further use of the signal. 


Reconstruction with hold operation 


To make our reconstruction system realizable there are many things to look 
into. Among them is the fact that any practical reconstruction system must 
input finite length pulses into the reconstruction filter. This can be 
accomplished by the hold operation. To alleviate the distortion caused by 
the hold operator we apply the output from the hold device to a compensator. 
The compensation can be as accurate as we wish; this is a cost and 
application consideration. 


[Figure: More practical reconstruction system with a hold component. Blocks: Sampling, Hold, Compensate. Signals: x(t), x_s(n), x̂(t).]


By the use of the hold component the reconstruction will not be exact, but 
as mentioned above we can get as close as we want. 




Exercises 


Problems related to the Sampling Theorem module. 
Exercise: 


Problem: Express the sampling theorem in words. 
Solution: 


Fill in the solution here... 
Exercise: 
Problem: 


Theoretically, why is the sinc-function so important for reconstruction? 
Sketch a sinc(t). What are the values for integer values of t? 


Solution: 


Fill in the solution here... 


Exercise: 


Problem: Argue that the sampling rate for CD should be over 40 kHz. 
Solution: 


The human ear can hear frequencies up to 20 kHz, so according to the 
sampling theorem we should sample at a rate exceeding 40 kHz. In 
practice we always have to sample at more than the double 
rate, partly due to finite precision. 


Exercise: 


Problem: 
(By Don Johnson) 


What is the simplest bandlimited signal? Using this signal, convince 
yourself that less than two samples/period will not suffice to specify it. 
If the sampling rate 1/T_s is not high enough, what signal would your 
resulting undersampled signal become? Hint: Try the aliasing applet. 


Solution: 


The simplest bandlimited signal is the sine wave. At the Nyquist 
frequency, exactly two samples/period would occur. Reducing the 
sampling rate would result in fewer samples/period, and these samples 
would appear to have arisen from a lower frequency sinusoid. 


Exercise: 
Problem: 
Is the filter h(t) described by the sinc function the only filter we can 
use as a perfect reconstruction filter? If not, what are the conditions that 
would allow us to use another filter? 


Solution: 


Fill in a solution here 
Exercise: 
Problem: 
If you found that it is possible to use another filter in [link] specify 


such a filter. Hint: Try using the domain which usually simplifies 
things... 


Solution: 


Fill in a solution here 


Exercise: 


Problem: 


What are the difficulties introduced when we want to apply the results 
of this chapter in practice? 


Solution: 


Fill in a solution here 
Exercise: 


Problem: 


If a real signal has frequency content up to fF. What is then the 
bandwidth of the signal? 


Solution: 


Fill in a solution here 
Exercise: 


Problem: 


If a real signal has frequency content confined in the interval |—F}, F’| 
. What is then the bandwidth of the signal? 


Solution: 


Fill in a solution here 
Exercise: 


Problem: 


What can be said in general for the spectrum of a discrete signal which 
is the result of sampling an analog signal that is NOT bandlimited? 


Solution: 


The spectrum will ALWAYS overlap; there will always be aliasing. 


Exercises related to the Aliasing applet 


Link to the aliasing applet (Right click if you want to open it in a new 
window). 


In the following problems, as in the aliasing applet, we are studying a 
sinusoidal signal, x(t) = sin(2πft), which is sampled at F_s = 8000 Hz. 
Exercise: 


Problem: 


What is the frequency limitation of an analog sinusoidal signal if we 
want to avoid aliasing, given F_s = 8000 Hz? 


Solution: 


With a sampling frequency of 8000 Hz, the maximum frequency of the 
analog signal is 4000 Hz, as given by the sampling theorem. 


Exercise: 
Problem: 
Describe with words the type of signal we "reconstruct" from the 


samples when the input frequency (of the sinusoidal signal) is higher 
than the sample rate can deal with? 


Solution: 


The signal we "reconstruct" is a sinusoidal signal with a frequency that 
is lower than the original because of aliasing. 

Exercise: 
Problem: 


Find an expression for the signal we "reconstruct" from the samples when 
the input frequency is 6000 Hz. 


Solution: 


When the input frequency is 6000 Hz, a sampling frequency of 8000 
Hz is too low, i.e. aliasing will occur. The sampled signal will have 
frequency components at +6000 Hz and -6000 Hz plus some new 
frequency components as a result of aliasing. 


We know from the proof of the sampling theorem that the spectrum of the 
sampled signal is periodic with period F_s = 8000 Hz. Thus a frequency 
component at 6000 Hz implies frequencies at -2000 Hz, -10000 Hz, 14000 Hz 
and so on. Similarly a frequency component at -6000 Hz gives rise to (among 
others) a 2000 Hz component. Looking only at the positive frequencies, 
the "reconstructed" signal will only have a 2000 Hz frequency 
component. The removal of the frequencies at 6000 Hz and above is 
due to the reconstruction filter, which is designed for a 
maximum input signal frequency of 4000 Hz. Because the +6000 Hz 
component lands at -2000 Hz and the -6000 Hz component at +2000 Hz, the 
sine changes sign, and the "reconstructed" signal can be written as 
−sin(2π·2000t) = sin(−2π·2000t). 
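
The aliased frequency, and the fact that the samples of the 6000 Hz input coincide with those of a 2000 Hz sinusoid, can be verified with a few lines of code. This is a minimal sketch; the helper alias_frequency is a hypothetical name, and it also checks the sign: the samples match −sin(2π·2000t), i.e. a 2000 Hz sinusoid with inverted phase.

```python
import math

def alias_frequency(f, Fs):
    """Frequency in [0, Fs/2] at which a real sinusoid of frequency f
    appears after sampling at Fs (the spectrum repeats with period Fs)."""
    f_mod = f % Fs
    return min(f_mod, Fs - f_mod)

Fs = 8000.0
f_in = 6000.0
f_alias = alias_frequency(f_in, Fs)
print(f"input {f_in} Hz appears at {f_alias} Hz")

# The samples of the 6000 Hz sine equal those of a phase-inverted 2000 Hz sine.
for n in range(16):
    original = math.sin(2.0 * math.pi * f_in * n / Fs)
    aliased = -math.sin(2.0 * math.pi * f_alias * n / Fs)
    assert abs(original - aliased) < 1e-9
```

Since the two sample sequences are identical, no reconstruction filter can tell the two signals apart.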


Exercise: 
Problem: 


Explain the "strange" sample points when the input frequency is 
4000 Hz. 


Solution: 


The sampled signal can be written as 


x_s(n) = sin(2π · 4000 · n/8000) = sin(πn) = 0. Thus all the samples are 
zero-valued. 
Exercise: 
Problem: 


Explain the "strange" sample points when the input frequency is 
8000 Hz. 


Solution: 


The sampled signal can be written as 


x_s(n) = sin(2π · 8000 · n/8000) = sin(2πn) = 0. Thus all the samples are 
zero-valued. 
Exercise: 


Problem: 


Find an expression for the signal we can reconstruct from the samples 
when the input frequency is 4000 Hz. 


Solution: 


As shown in problem 14, the samples are zero valued. A 
reconstructing filter cannot distinguish this from the all zero signal so 
the reconstructed signal will be the all zero signal. 


Note that a small change in the sinusoidal signal's phase would produce 
samples that are not all zero-valued. The "reconstructed" signal will 
then be a 4000 Hz sinusoid, although its amplitude and phase generally 
differ from those of the original. This problem illustrates that 
sampling at twice the signal's highest frequency component does not 
always guarantee perfect reconstruction. If we could increase the 
sampling frequency to, say, F_s = 8000.00001 Hz, we could reconstruct 
the original signal. I.e. sampling at a rate greater than twice the highest 
frequency component yields the desired reconstruction. 


Exercise: 
Problem: 


Find an expression for the "reconstructed" signal from the samples 
when the input frequency is 8000 Hz. 


Solution: 


As shown in problem 15, the samples are zero valued. A 
reconstructing filter cannot distinguish this from the all zero signal so 
the reconstructed signal will be the all zero signal. 


Note that a small change in the sinusoidal signal's phase would produce 
samples that are not all zero-valued. The "reconstructed" signal will 
then be a signal with aliased components. 


Introduction 


In this and the following modules the basic concepts of information theory will be 
introduced. For simplicity we assume that the signals are time discrete. Time discrete 
signals often arise from sampling a time continuous signal. The assumption of time 
discrete signals is valid because we will only be looking at bandlimited signals (which 
can, as we know, be perfectly reconstructed from their samples). 


In treating time discrete signals and their information content we have to distinguish 
between two types of signals: 


• signals that have amplitude levels belonging to a finite set 
• signals that have amplitudes taken from the real line 


In the first case we can measure the information content in terms of entropy, while in the 
second case the entropy is infinite and we must resort to characterising the source by means 
of differential entropy. 


Examples of information sources 


The signals treated are mainly of a stochastic nature, i.e. the signal is unknown to us. 
Since the signal is not known to the destination (because of its stochastic nature), it is 
best modeled as a random process, discrete-time or continuous-time. Examples of 
information sources that we model as random processes are: 


e Digital data source (e.g. a text) can be modeled as a random process. 

e Video signals can be modeled as a random process. Such signals are mainly 
bandlimited to around 5 MHz (the value depends on the standards used to raster the 
frames of image). 

e Audio signals can be modeled as a random process. Speech is typically between 300 
Hz and 3400 Hz, see [link]. 


[Figure: Power spectral density plot of speech, with 300 Hz and 3400 Hz marked on the frequency axis.]


Video and speech are analog information signals that are bandlimited. Therefore, if sampled 
faster than two times the highest frequency component, they can be reconstructed from 
their sample values. 


Example: 


A speech signal with bandwidth of 3100 Hz can be sampled at the rate of 6.2 kHz. If the 
samples are quantized with an 8 level quantizer then the speech signal can be represented 
with a binary sequence with bit rate 

Equation: 


6200 \log_2 8 = 18600 bits/sec 


[Figure: Analog speech signal sampled and quantised. The speech signal is sampled every T = 1/(6.2 × 10^3) seconds and turned into a bit stream such as 0011011010111100.]


The sampled real values can be quantized to create a discrete-time discrete-valued 
random process. 


The Core of Information theory 


The key observation from the discussion above is that for a receiver the signals are 
unknown. It is exactly this uncertainty that enables the signal to transmit information. This 
is the core of information theory: 


Note: Information transfer happens when the receiver is unable to know or predict a 
message before it is received. 


Some statistics 


Here we present some statistics with the intent of reviewing a few basic concepts and to 
introduce the notation. 


Let X be a stochastic variable. Let X = x_i and X = x_j denote two outcomes of X. 


• Dependent outcomes implies: 
  Pr[X = x_i, X = x_j] = Pr[X = x_i] Pr[X = x_j | X = x_i] = Pr[X = x_j] Pr[X = x_i | X = x_j] 
• Independent outcomes implies Pr[X = x_i, X = x_j] = Pr[X = x_i] Pr[X = x_j] 
• Bayes' rule: Pr[X = x_i | X = x_j] = Pr[X = x_j | X = x_i] Pr[X = x_i] / Pr[X = x_j] 


More about basic probability theory and a derivation of Bayes' rule can be found here. 


Information 


In this module we introduce the concept of self information for an outcome of 
a stochastic variable. 


Example: 

Bergen, Norway is a rainy city. If the locals are "lucky" there are "only" 200 
rainy days in a particular year. Let the random variable Z take the two values: 
"Rain", "No rain". Assuming 200 rainy days a year, we get 
Pr[Z = Rain] = 200/365 and Pr[Z = No Rain] = 165/365. We state that 
Z = No Rain carries more information than Z = Rain; the reason is that the 
inhabitants of Bergen expect rain, so whenever it's not raining they are (more) 
surprised. An intuitively reasonable information measure should be larger 
when the probability is small. 


Example: 

The information content in a statement about the temperature and new lottery 
millionaires in Verdal, Norway on a given Saturday should be the sum of the 
information on temperature on the particular Saturday in Verdal and the 
information of the number of new lucky lottery winners (under the 
assumption that these observations are independent). Let I denote the 
information of an event, then 

Equation: 


I(temperature, lottery winners) = I(temperature) + I(lottery winners) 


The self information formula 


An intuitive and meaningful measure of self information in an event should 
have the following properties: 


1. The more uncertain you are, in advance, about the outcome, the more 
new information you get by observing the actual outcome, or 
equivalently an event with low probability, p_n, has high self information 
I(p_n). I(p_n) should be a monotonically decreasing function of p_n. 

2. Observing an event with certain outcome, i.e. p_n = 1, should give zero 
information. The event is then said to have zero self information. 
Since I(p_n) is monotonically decreasing for p_n ∈ [0, 1] this implies that 
the self information can never be less than zero; the observer can never 
lose information by observing an outcome. 

3. If we receive independent messages, the information should accumulate. 
This means that the measure must be additive. 


It can be shown that there only exists one function satisfying the above 
conditions. 


Note: I(p_n) = log_b (1/p_n) = −log_b p_n 


In the above equation the logarithm base b can be chosen arbitrarily. Usually 
b = 2 is chosen so that the unit of information is the bit. The choice 
b = 2 is made to adapt to a digital "world", that is to facilitate electronic 
storage and transmission. 
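
The three properties can be checked numerically for the Bergen example. The following is a small sketch, assuming 200 rainy days out of 365; the function name self_information is illustrative only.

```python
import math

def self_information(p, base=2.0):
    """Self information -log_b(p) of an outcome with probability p (0 < p <= 1)."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    return -math.log(p) / math.log(base)

p_rain = 200 / 365
p_no_rain = 165 / 365

print(f"I(Rain)    = {self_information(p_rain):.3f} bits")
print(f"I(No rain) = {self_information(p_no_rain):.3f} bits")

# Additivity for independent outcomes: I(p*q) = I(p) + I(q).
p, q = 0.25, 0.5
print(self_information(p * q), self_information(p) + self_information(q))
```

The less probable outcome ("No rain") carries more information, and the information of two independent outcomes adds up.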


Representing symbols by bits 


Introduction 


Often we want to represent data, e.g. characters, images, in a binary form. 
By binary form we mean representing by the symbols "0", and "1". Using 
binary representation allows us to conveniently store, retrieve, and 
manipulate them with a computer. To work with data in binary form we 
must have a fixed way of encoding (representing) a fixed data stream. The 
set of all binary sequences in a representation of some data is called a code. 
(Note that this has nothing to do with cryptology). Usually we refer to the 
data that we want to represent by bits as a source. 


Example: 

Representing English Characters 

Let us consider a very practical example of the above ideas. Let our source 
be a stream of English characters. Now we want to represent this stream of 
characters as bits, say to store it on a computer or send it over the Internet. 
First we need to know the number of such characters, which is 
(traditionally) conveniently set to 128. The number 128 is obtained by 
summing upper case characters (26), lower case (26), digits (10), brackets 
and punctuation (20), odd characters (14) (the "&" is an odd character), 
and control characters (32). 

Obviously we need to have a unique representation of each of the 128 
characters; this can e.g. be obtained by exhausting the 128 bit combinations 
that 7 concatenated bits give. Thus we have devised a 7-bit code. A 
well known 7-bit code is ASCII, short for "American Standard Code for 
Information Interchange". Adding a parity bit for error control to the 
ASCII code forms an 8-bit code. As an example, the representation of an 
"A" in ASCII is 1000001. 

Now, one can ask whether the 7-bit ASCII code is an optimal 
representation in terms of using, on average, the minimum number of bits 
representing the English characters? We will return to this question later (in 
example 3). 


Minimal representation 


When representing a source we want to use as few bits as possible, as this 
will imply that less disk space is required for storage or that transmission 
over the Internet is quicker. However, we do not want to use so few bits that 
the receiver cannot determine what was sent or stored. 


So, for a given source what is the minimal representation? Here we consider 
the minimal representation as the representation that uses the minimum 
number of bits (on average) to encode the source without errors. According 
to Shannon's source coding theorem, for a source that produces statistically 
independent outcomes, the minimum average number of bits per symbol is 
the entropy of the source! (A classical example of a source that produces 
statistically independent outcomes is throwing a die.) 


Average indicates that the number of bits used for a specific symbol may be 
different from the number of bits representing another. E.g., as opposed to 
ASCII coding, we might represent an "A" with 7 bits, but an "E" with 3 
bits. But it also implies that when you receive a series of symbols, the 
number you receive per time unit, say per second, will not be exactly the 
same, but averaged over a long term period, the rate is proportional to time 
with the rate per symbol as the proportionality constant. 


Let us assume that we represent a symbol x_n, with probability p_n, by l_n 
bits. Then, the average number of bits spent per symbol will be 
Equation: 


\bar{L} = \sum_n p_n l_n


We see that this equation is equal to the entropy if the code words are 
selected to have the lengths l_n = −log p_n. Thus, if the source produces 
stochastically independent outcomes with probabilities p_n, such that log p_n 
is an integer, then we can easily find an optimal code as we show in the next 
example. 


Example: 

Finding a minimal representation 

A four-symbol alphabet produces stochastically independent outcomes 
with the following probabilities. 


Pr[x_1] = 1/2 
Pr[x_2] = 1/4 
Pr[x_3] = 1/8 
Pr[x_4] = 1/8 


and an entropy of 1.75 bits/symbol. Let's see if we can find a codebook for 
this four-letter alphabet that satisfies the Source Coding Theorem. The 
simplest code to try is known as the simple binary code: convert the 
symbol's index into a binary number and use the same number of bits for 
each symbol by including leading zeros where necessary. 

Equation: 


x_1 ↔ 00, x_2 ↔ 01, x_3 ↔ 10, x_4 ↔ 11 


As all symbols are represented by 2 bits, obviously the average number of 
bits per symbol is 2. Because the entropy equals 1.75 bits, the simple 
binary code is not a minimal representation according to the source coding 
theorem. If we chose a codebook with differing number of bits for the 
symbols, a smaller average number of bits can indeed be obtained. The 
idea is to use shorter bit sequences for the symbols that occur more 
often, i.e., symbols that have a higher probability. One codebook like this 
is 

Equation: 


x_1 ↔ 0, x_2 ↔ 10, x_3 ↔ 110, x_4 ↔ 111 


Now the average length is \bar{L} = 1 × 1/2 + 2 × 1/4 + 3 × 1/8 + 3 × 1/8 = 1.75. 
We can reach the entropy limit! This should come as no surprise: as promised 
above, when log p_n is an integer for all n, the optimal code is easily found. 

The simple binary code is, in this case, less efficient than the unequal- 
length code. Using the efficient code, we can transmit the symbolic-valued 
signal having this alphabet 12.5% faster. Furthermore, we know that no 
more efficient codebook can be found because of Shannon's source coding 
theorem. 
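
The numbers in this example can be reproduced with a short script. This is a sketch assuming the probabilities 1/2, 1/4, 1/8, 1/8 (they reproduce the stated entropy of 1.75 bits/symbol) and the two codebooks 00/01/10/11 and 0/10/110/111; the dictionary and function names are illustrative.

```python
import math

# Assumed symbol probabilities; they give the stated entropy of 1.75 bits.
probabilities = {"x1": 0.5, "x2": 0.25, "x3": 0.125, "x4": 0.125}

simple_code = {"x1": "00", "x2": "01", "x3": "10", "x4": "11"}
unequal_code = {"x1": "0", "x2": "10", "x3": "110", "x4": "111"}

def entropy_bits(probs):
    """Entropy in bits per symbol: -sum of p * log2(p)."""
    return -sum(p * math.log2(p) for p in probs.values() if p > 0.0)

def average_length(probs, code):
    """Average number of bits per symbol: sum of p_n * l_n."""
    return sum(probs[s] * len(code[s]) for s in probs)

H = entropy_bits(probabilities)
L_simple = average_length(probabilities, simple_code)
L_unequal = average_length(probabilities, unequal_code)

print(f"entropy            : {H} bits/symbol")
print(f"simple binary code : {L_simple} bits/symbol")
print(f"unequal-length code: {L_unequal} bits/symbol")
print(f"saving             : {100 * (L_simple - L_unequal) / L_simple}%")
```

The unequal-length code meets the entropy exactly because every probability is a power of 1/2.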


Example: 

Optimality of the ASCII code 

Let us return to the ASCII codes presented in [link]. Is the 7-bit ASCII 
code optimal, i.e., is it a minimal representation? The 7-bit ASCII code 
assigns an equal length (7 bits) to all characters it represents. Thus, it would 
be optimal if all of the 128 characters were equiprobable, that is each 
character should have a probability of 1/128. To find out whether the 


characters really are equiprobable an analysis of all English texts would be 
needed. Such an analysis is difficult to do. However, the letter "E" is more 
probable than the letter "Z", so the equiprobable assumption does not hold, 
and the ASCII code is not optimal. 

(A technical note: We should take into account that in English text 
subsequent outcomes are not stochastically independent. To see this, 
assume the first letter to be "b", then it is more probable that the next letter 
is "e", than "z". In the case where the outcomes are not stochastically 
independent, the formulation we have given of Shannon's source coding 
theorem is no longer valid, to fix this, we should replace the entropy with 
the entropy rate, but we will not pursue this here). 


Generating efficient codes 


From Shannon's source coding theorem we know what the minimum 
average rate needed to represent a source is. But other than in the case when 


the logarithm of the probabilities gives an integer, we do not get any 
indications on how to obtain that rate. It is a large area of research to get 
close to the Shannon entropy bound. One clever way to do encoding is the 
Huffman coding scheme. 


Entropy 


The self information gives the information in a single outcome. In most 
cases, e.g in data compression, it is much more interesting to know the 
average information content of a source. This average is given by the 
expected value of the self information with respect to the source's 
probability distribution. This average of self information is called the source 
entropy. 


Definition of entropy 


Entropy 
The entropy (average self information) of a discrete random variable 
X is a function of its probability mass function and is defined as 
Equation: 


H(X) = -\sum_{i=1}^{N} p_X(x_i) \log p_X(x_i)


where N is the number of possible values of X and 
p_X(x_i) = Pr[X = x_i]. If log is base 2 then the unit of entropy is bits 
per (source) symbol. Entropy is a measure of uncertainty in a random 
variable and a measure of the information it can reveal. 

If a symbol has zero probability, which means it never occurs, it should 
not affect the entropy. Letting 0 × log 0 = 0, we have dealt with that. 


In texts you will find that the argument to the entropy function may vary. 
The two most common are H(X) and H(p). We calculate the entropy of a 
source X, but the entropy is, strictly speaking, a function of the source's 
probability function p. So both notations are justified. 


Calculating the binary logarithm 


Most calculators do not allow you to directly calculate the logarithm with 
base 2, so we have to use a logarithm base that most calculators support. 
Fortunately it is easy to convert between different bases. 


Assume you want to calculate log_2 x, where x > 0. Then log_2 x = y 
implies that 2^y = x. Taking the natural logarithm on both sides we obtain 
y ln 2 = ln x, that is 


Note: log_2 x = ln(x) / ln(2) 


Examples 


Example: 
When throwing a die, one may ask for the average information conveyed 
in a single throw. For a fair die, p_X(x_i) = 1/6, and using the formula for 
entropy we get 


H(X) = -\sum_{i=1}^{6} p_X(x_i) \log_2 p_X(x_i) = \log_2 6 bits/symbol 


Example: 

If a source produces binary information {0, 1} with probabilities p and 
1 − p, the entropy of the source is 

Equation: 


H(X) = -p \log_2 p - (1 - p) \log_2 (1 - p)


If p = 0 then H(X) = 0, if p = 1 then H(X) = 0, if p = 1/2 then 
H(X) = 1. The source has its largest entropy if p = 1/2 and the source 
provides no new information if p = 0 or p = 1. 
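
The behaviour of the binary source can be tabulated with a few lines of code. This is a minimal sketch; the function name binary_entropy is illustrative, and the convention 0 × log 0 = 0 is handled explicitly.

```python
import math

def binary_entropy(p):
    """Entropy in bits of a binary source with probabilities p and 1 - p."""
    if p in (0.0, 1.0):
        return 0.0  # convention: 0 * log 0 = 0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

for p in (0.0, 0.1, 0.25, 0.5, 0.75, 0.9, 1.0):
    print(f"p = {p:4.2f}  H = {binary_entropy(p):.4f} bits")
```

The table is symmetric around p = 1/2, where the entropy reaches its maximum of 1 bit.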


[Figure: plot of H(p).]


Example: 

An analog source is modeled as a continuous-time random process with 
power spectral density bandlimited to the band between 0 and 4000 Hz. 
The signal is sampled at the Nyquist rate. The sequence of random 
variables, as a result of sampling, are assumed to be independent. The 
samples are quantized to 5 levels {−2, −1, 0, 1, 2}. The probabilities of the 
samples taking the quantized values are {1/2, 1/4, 1/8, 1/16, 1/16}, respectively. 
The entropy of the random variables is 

Equation: 


H(X) = -\sum_{i=1}^{5} p_X(x_i) \log_2 p_X(x_i)
     = \frac{1}{2}\log_2 2 + \frac{1}{4}\log_2 4 + \frac{1}{8}\log_2 8 + \frac{1}{16}\log_2 16 + \frac{1}{16}\log_2 16
     = \frac{15}{8} bits/sample 


There are 8000 samples per second. Therefore, the source produces 
8000 × 15/8 = 15000 bits/sec of information. 
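
The information rate in this example can be recomputed directly. The sketch below assumes the level probabilities 1/2, 1/4, 1/8, 1/16, 1/16, which are consistent with the stated result of 15000 bits/sec at 8000 samples per second; the variable names are illustrative.

```python
import math

levels = [-2, -1, 0, 1, 2]
# Assumed probabilities of the five quantizer levels.
probabilities = [1 / 2, 1 / 4, 1 / 8, 1 / 16, 1 / 16]

# Entropy per sample in bits.
entropy = -sum(p * math.log2(p) for p in probabilities)

# Nyquist rate for a signal bandlimited to 4000 Hz.
sample_rate = 2 * 4000

information_rate = sample_rate * entropy
print(f"H(X) = {entropy} bits/sample")
print(f"rate = {information_rate} bits/sec")
```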


Entropy is closely tied to source coding. The extent to which a source can 
be compressed is related to its entropy. There are many interpretations 
possible for the entropy of a random variable, including 


e (Average)Self information in a random variable 


e Minimum number of bits per source symbol required to describe the 
random variable without loss 

e Description complexity 

e Measure of uncertainty in a random variable 


References 


• Øien, G.E. and Lundheim, L. (2003) Information Theory, Coding 
and Compression, Trondheim: Tapir Akademisk forlag. 


Differential Entropy 


Consider the entropy of continuous random variables. Whereas the 
(normal) entropy is the entropy of a discrete random variable, the 
differential entropy is the entropy of a continuous random variable. 


Differential Entropy 


Differential entropy 
The differential entropy h(X) of a continuous random variable X with 
a pdf f(x) is defined as 
Equation: 


h(X) = -\int_{-\infty}^{\infty} f(x) \log f(x) \, dx


Usually the logarithm is taken to be base 2, so that the unit of the 
differential entropy is bits/symbol. Note that, as in the discrete case, h(X) 
depends only on the pdf of X. Finally, we note that the differential entropy 
is the expected value of −log f(X), i.e., 

Equation: 


h(X) = -E[\log f(X)]


Now, consider calculating the differential entropy of some random 
variables. 


Example: 

Consider a uniformly distributed random variable X from c to c + Δ. 
Then its density is 1/Δ from c to c + Δ, and zero otherwise. 

We can then find its differential entropy as follows, 

Equation: 


h(X) = -\int_{c}^{c+\Delta} \frac{1}{\Delta} \log \frac{1}{\Delta} \, dx = \log \Delta


Note that by making Δ arbitrarily small, the differential entropy can be 
made arbitrarily negative, while taking Δ arbitrarily large, the differential 
entropy becomes arbitrarily positive. 


Example: 


Consider a normal distributed random variable X, with mean m and 
variance σ². Then its density is 


f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{(x-m)^2}{2\sigma^2}}


We can then find its differential entropy as follows, first calculate 
−log f(x): 
Equation: 


-\log f(x) = \frac{1}{2} \log (2\pi\sigma^2) + \log e \, \frac{(x-m)^2}{2\sigma^2}


Then since E[(X − m)²] = σ², we have 


Equation: 


h(X) = \frac{1}{2} \log (2\pi\sigma^2) + \frac{1}{2} \log e = \frac{1}{2} \log (2\pi e \sigma^2)


Properties of the differential entropy 
In this section we list some properties of the differential entropy. 


• The differential entropy can be negative. 
• h(X + c) = h(X), that is translation does not change the differential 
entropy. 
• h(aX) = h(X) + log |a|, that is scaling does change the differential 
entropy. 


The first property is seen from both [link] and [link]. The two latter can be 
shown by using [link]. 
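
The closed-form results for the uniform and the normal distribution can be compared with a direct numerical evaluation of the defining integral. This is only a sketch: it uses a simple midpoint rule over a finite interval, base-2 logarithms, and parameter values (width 0.5, σ = 2) chosen here for illustration.

```python
import math

def differential_entropy_bits(pdf, a, b, steps=20000):
    """Midpoint-rule approximation of -integral of f(x) * log2 f(x) over [a, b]."""
    dx = (b - a) / steps
    total = 0.0
    for k in range(steps):
        fx = pdf(a + (k + 0.5) * dx)
        if fx > 0.0:  # points where f(x) = 0 contribute nothing
            total -= fx * math.log2(fx) * dx
    return total

def uniform_pdf(c, width):
    return lambda x: 1.0 / width if c <= x <= c + width else 0.0

def gaussian_pdf(m, sigma):
    norm = 1.0 / math.sqrt(2.0 * math.pi * sigma ** 2)
    return lambda x: norm * math.exp(-((x - m) ** 2) / (2.0 * sigma ** 2))

# Uniform on [1, 1.5]: the closed form is log2(0.5) = -1 bit (negative).
h_uniform = differential_entropy_bits(uniform_pdf(1.0, 0.5), 1.0, 1.5)

# Gaussian with sigma = 2, integrated over +/- 10 standard deviations.
h_gauss = differential_entropy_bits(gaussian_pdf(0.0, 2.0), -20.0, 20.0)
closed_form_gauss = 0.5 * math.log2(2.0 * math.pi * math.e * 2.0 ** 2)

print(f"uniform : {h_uniform:.6f} bits")
print(f"gaussian: {h_gauss:.6f} bits (closed form {closed_form_gauss:.6f})")
```

The uniform case also illustrates the first property: for a width smaller than 1 the differential entropy is negative.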


Huffman Coding 


One particular source coding algorithm is the Huffman encoding algorithm. 
It is a source coding algorithm which approaches, and sometimes achieves, 
Shannon's bound for source compression. A brief discussion of the 
algorithm is also given in another module. 


Huffman encoding algorithm 


1. Sort source outputs in decreasing order of their probabilities. 

2. Merge the two least-probable outputs into a single output whose 
probability is the sum of the corresponding probabilities. 

3. If the number of remaining outputs is more than 2, then go to step 1. 

4. Arbitrarily assign 0 and 1 as codewords for the two remaining outputs. 

5. If an output is the result of the merger of two outputs in a preceding 
step, append a 0 and a 1 to the current codeword to obtain the 
codewords of the two preceding outputs, and repeat step 5. If no output is 
preceded by another output in a preceding step, then stop. 
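
The merging procedure above can be sketched in code. Note that this is a common heap-based variant and not a literal transcription of the five steps: instead of sorting repeatedly and assigning bits on the way back, it keeps the outputs in a priority queue and prepends a bit to every codeword in a group each time two groups are merged. The function name huffman_code and the tie-breaking counter are choices made here for illustration.

```python
import heapq
from itertools import count

def huffman_code(probabilities):
    """Return a dict symbol -> codeword for the given symbol probabilities."""
    if len(probabilities) == 1:
        return {symbol: "0" for symbol in probabilities}

    tiebreak = count()  # unique counter so that dicts are never compared
    heap = [(p, next(tiebreak), {symbol: ""})
            for symbol, p in probabilities.items()]
    heapq.heapify(heap)

    while len(heap) > 1:
        # Merge the two least-probable groups.
        p0, _, group0 = heapq.heappop(heap)
        p1, _, group1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in group0.items()}
        merged.update({s: "1" + c for s, c in group1.items()})
        heapq.heappush(heap, (p0 + p1, next(tiebreak), merged))

    return heap[0][2]

probs = {"A": 0.5, "B": 0.25, "C": 0.125, "D": 0.125}
code = huffman_code(probs)
average = sum(probs[s] * len(code[s]) for s in probs)
print(code)
print(f"average length = {average} bits")
```

Which of two merged groups receives the 0 and which the 1 is arbitrary, so other implementations may produce different codewords with the same lengths.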




Example: 
X ∈ {A, B, C, D} with probabilities {1/2, 1/4, 1/8, 1/8} 

[Figure: Huffman code tree listing each symbol with its probability and codeword.]


Average length = (1/2)·1 + (1/4)·2 + (1/8)·3 + (1/8)·3 = 7/4. As you may recall, the 
entropy of the source was also H(X) = 7/4. In this case, the Huffman 
code achieves the lower bound of 7/4 bits per output. 

In general, we can define average code length as 
Equation: 


\bar{\ell} = \sum_{x \in \mathcal{X}} p_X(x) \ell(x)


where \mathcal{X} is the set of possible values of x. 
It is not very hard to show that 
Equation: 


H(X) \le \bar{\ell} < H(X) + 1

For compressing a single source output at a time, Huffman codes provide 
nearly optimum code lengths. 
The drawbacks of Huffman coding 


1. Codes are variable length. 
2. The algorithm requires the knowledge of the probabilities, p_X(x) for 
all x ∈ X. 


Another powerful source coder that does not have the above shortcomings 
is the Lempel-Ziv coder. 


Decibel scale with signal processing applications 


Introduction 


The concept of decibel originates from telephone engineers who were 
working with power loss in a telephone line consisting of cascaded circuits. 
The power loss in each circuit is the ratio of the power in to the power out, 
or equivalently, the power gain is the ratio of the power out to the power 
in. 


Let P_in be the power input to a telephone line and P_out the power out. The 
power gain is then given by 
Equation: 


\text{Gain} = \frac{P_{out}}{P_{in}}


Taking the logarithm of the gain formula we obtain a comparative measure 
called Bel. 


Note: Gain (Bel) = log_10 (P_out / P_in) 


This measure is in honour of Alexander G. Bell, see [link]. 


Alexander G. Bell 


Decibel 


Bel is often too large a quantity, so we define a more useful measure, decibel: 
Equation: 


\text{Gain (dB)} = 10 \log_{10} \frac{P_{out}}{P_{in}}


Please note from the definition that the gain in dB is relative to the input 
power. In general we define: 
Equation: 


\text{Number of decibels} = 10 \log_{10} \frac{P}{P_{ref}}


If no reference level is given it is customary to use P_ref = 1 W, in which 
case we have: 


Note: Number of decibels = 10 log_10 P 


Example: 

Given the power spectral density (psd) function of a signal x(n), S_xx(if). 
Express the magnitude of the psd in decibels. 

We find S_xx(dB) = 10 log |S_xx(if)|. 


More about decibels 


Above we've calculated the decibel equivalent of power. Power is a 
quadratic variable, whereas voltage and current are linear variables. This 
can be seen, for example, from the formulas P = V²/R and P = I²R. 


So if we want to find the decibel value of a current or voltage, or more 
generally an amplitude, we use: 
Equation: 


\text{Amplitude (dB)} = 20 \log_{10} \frac{\text{Amplitude}}{\text{Amplitude}_{ref}}


This is illustrated in the following example. 


Example: 

Express the magnitude of the filter H(if) in dB scale. 
The magnitude is given by |H(if)|, which gives: 
|H(dB)| = 20 log |H(if)|. 


Plots of the magnitude of an example filter |H(if)| and its decibel 
equivalent are shown in [link]. 
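As a rough numerical companion to the example, the sketch below evaluates the magnitude of a filter and its decibel equivalent at a few frequencies. The 3-tap moving average, the helper names and the voltage values are assumptions chosen for illustration; they are not the filter from the course's Matlab file.

```python
import cmath
import math


def amplitude_to_db(a, a_ref=1.0):
    """Return 20*log10(|a| / a_ref), or -inf for a zero amplitude."""
    ratio = abs(a) / a_ref
    return -math.inf if ratio == 0.0 else 20.0 * math.log10(ratio)


def frequency_response(h, f):
    """Frequency response of an FIR filter with taps h, f in cycles per sample."""
    return sum(hk * cmath.exp(-2j * math.pi * f * k) for k, hk in enumerate(h))


h = [1 / 3, 1 / 3, 1 / 3]  # hypothetical 3-tap moving average
for f in (0.0, 0.1, 0.2, 0.4):
    mag = abs(frequency_response(h, f))
    print(f"f = {f:.1f}: |H| = {mag:.4f}, |H| in dB = {amplitude_to_db(mag):.2f}")

# Power is quadratic in voltage, so 20*log10 of a voltage ratio equals
# 10*log10 of the corresponding power ratio across the same resistance.
v, v_ref, r = 2.0, 1.0, 50.0  # hypothetical values
power_db = 10.0 * math.log10((v ** 2 / r) / (v_ref ** 2 / r))
voltage_db = amplitude_to_db(v, v_ref)
print(f"{power_db:.4f} dB from power, {voltage_db:.4f} dB from voltage")
```

The last two lines show why the factor is 20 for amplitudes: both routes give the same number of decibels.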


Magnitude responses.


Some basic arithmetic 


The ratios 1, 10, 100 and 1000 give dB values 0 dB, 10 dB, 20 dB and 30 dB
respectively. This implies that an increase of 10 dB corresponds to a ratio
increase by a factor 10.


This can easily be shown: Given a ratio $R$ we have $R[\text{dB}] = 10 \log_{10} R$.
Increasing the ratio by a factor of 10 we have: $10 \log_{10}(10 R) = 10 \log_{10} 10 +
10 \log_{10} R = 10\ \text{dB} + R[\text{dB}]$.


Another important dB value is 3 dB. This comes from the fact that:


An increase by a factor 2 gives an increase of $10 \log_{10} 2 \approx 3$ dB. An
“increase” by a factor 1/2 gives an “increase” of $10 \log_{10} (1/2) \approx -3$ dB.
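These rules of thumb are easy to check. The short Python sketch below (the helper name `ratio_to_db` is an assumption for illustration) tabulates the ratios mentioned above and confirms that multiplying a ratio by 10 adds 10 dB.

```python
import math


def ratio_to_db(r):
    """Return the decibel value 10*log10(r) of a power ratio r."""
    return 10.0 * math.log10(r)


for r in (1, 10, 100, 1000, 2, 0.5):
    print(f"ratio {r:>6}: {ratio_to_db(r):7.3f} dB")

# Multiplying any ratio by 10 adds 10 dB (7.0 is an arbitrary example ratio).
r = 7.0
difference = ratio_to_db(10 * r) - ratio_to_db(r)
print(f"difference = {difference:.3f} dB")
```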


Example: 

In filter terminology the cut-off frequency is a term that often appears.
The cut-off frequency (for lowpass and highpass filters), $f_c$, is the frequency
at which the squared magnitude response is 1/2. In decibel scale this
corresponds to about -3 dB.
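To make the definition concrete, the sketch below locates the cut-off frequency of a simple lowpass filter by bisection. The two-point averager, the function names and the bisection search are assumptions chosen for illustration; the search presumes that the squared magnitude falls monotonically through 1/2 between the two starting frequencies.

```python
import cmath
import math


def squared_magnitude(h, f):
    """Squared magnitude response of FIR taps h at f cycles per sample."""
    response = sum(hk * cmath.exp(-2j * math.pi * f * k) for k, hk in enumerate(h))
    return abs(response) ** 2


def half_power_frequency(h, lo=0.0, hi=0.5, iterations=60):
    """Bisect for the frequency where the squared magnitude equals 1/2.

    Assumes the squared magnitude is above 1/2 at lo, below 1/2 at hi,
    and decreasing in between.
    """
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if squared_magnitude(h, mid) > 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


h = [0.5, 0.5]  # hypothetical two-point averager, unit gain at f = 0
f_c = half_power_frequency(h)
level_db = 10.0 * math.log10(squared_magnitude(h, f_c))
print(f"f_c = {f_c:.4f} cycles/sample, level = {level_db:.2f} dB")
```

For this particular filter the squared magnitude is $\cos^2(\pi f)$, so the search should settle at a quarter of the sampling frequency.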


Decibels in linear systems 


In signal processing we have the following relations for linear systems: 
Equation: 


Y(if) = H(if)X(if) 


where $X$ and $H$ denote the input signal and the filter respectively. Taking
absolute values on both sides of [link] and converting to decibels we get:

$$20 \log_{10} |Y(if)| = 20 \log_{10} |H(if)| + 20 \log_{10} |X(if)|$$


Note: The output amplitude at a given frequency is simply given by the 
sum of the filter gain and the input amplitude, both in dB. 
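A quick numerical check of this additivity, using made-up complex values for the filter response and the input spectrum at a single frequency (both values are assumptions for illustration):

```python
import math


def amplitude_to_db(a):
    """Return 20*log10(|a|) for a nonzero, possibly complex, amplitude."""
    return 20.0 * math.log10(abs(a))


H = 0.5 - 0.5j  # hypothetical filter response at one frequency
X = 3.0 + 4.0j  # hypothetical input spectrum value, |X| = 5
Y = H * X       # output spectrum value at the same frequency

output_db = amplitude_to_db(Y)
sum_db = amplitude_to_db(H) + amplitude_to_db(X)
print(f"output: {output_db:.4f} dB, filter gain + input: {sum_db:.4f} dB")
```

The product in the linear domain becomes a sum in decibels, which is also why the gains of cascaded circuits can simply be added when expressed in dB.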


Other references: 


Above we have used $P_{ref} = 1$ W as a reference and obtained the standard
dB measure. In some applications it is more useful to use $P_{ref} = 1$ mW and
we then have the dBm measure.


Another example is when calculating the gain of different antennas. Then it
is customary to use an isotropic (equal radiation in all directions) antenna as
a reference. So for a given antenna we can use the dBi measure (i for
isotropic).
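Changing the reference only shifts the decibel value by a constant. The sketch below converts between watts and dBm; the function names are assumptions for illustration.

```python
import math

MILLIWATT = 1e-3  # the dBm reference power, in watts


def watts_to_dbm(p_watts):
    """Return the power level in dBm, i.e. decibels relative to 1 mW."""
    return 10.0 * math.log10(p_watts / MILLIWATT)


def dbm_to_watts(dbm):
    """Return the power in watts that corresponds to a level in dBm."""
    return MILLIWATT * 10.0 ** (dbm / 10.0)


for p in (1e-3, 0.1, 1.0):
    print(f"{p} W = {watts_to_dbm(p):.1f} dBm")
```

Since 1 W is 1000 mW, a level given relative to 1 W is 30 dB lower than the same level in dBm.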


Matlab files 


filter_example.m 


Filter types 


So what is a filter? In general a filter is a device that discriminates,
according to one or more attributes at its input, what passes through it. One
example is the colour filter, which absorbs light at certain wavelengths. Here
we shall describe frequency-selective filters. Such a filter is called
frequency-selective because it discriminates among the various frequency
components of its input. By filter design we can create filters that pass
signals with frequency components in some bands and attenuate signals with
content in other frequency bands.


It is customary to classify filters according to their frequency domain
characteristics. In the following we will take a look at lowpass, highpass,
bandpass, bandstop, allpass and notch filters. (All of the filters shown are
discrete-time.)


Ideal filter types 


Lowpass 


Attenuates frequencies above the cutoff frequency, letting frequencies below
the cutoff ($f_c$) through, see [link].


An ideal lowpass filter (magnitude response $|H(e^{j2\pi f})|$ versus frequency $f$).


Highpass 


Highpass filters stop low frequencies, letting higher frequencies through,
see [link].


An ideal highpass filter (magnitude response $|H(e^{j2\pi f})|$ versus frequency $f$).


Bandpass 


Lets through only frequencies in a certain range, see [link].


An ideal bandpass filter. 


Bandstop 


Stops frequencies in a certain range, see [link].


An ideal bandstop filter.


Allpass 


Lets all frequencies through, see [link].


An ideal allpass filter.
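The ideal magnitude responses described above can be written down directly as functions of frequency. The sketch below is one way to do so in Python; the function names, the band-edge parameters `fc`, `f1` and `f2`, and the choice to count the band edges as part of the passband of the lowpass and bandpass filters are all assumptions made for illustration. Frequencies are in cycles per sample.

```python
def ideal_lowpass(f, fc):
    """1 for |f| <= fc, 0 above the cutoff."""
    return 1.0 if abs(f) <= fc else 0.0


def ideal_highpass(f, fc):
    """Complement of the ideal lowpass: 0 up to the cutoff, 1 above it."""
    return 1.0 - ideal_lowpass(f, fc)


def ideal_bandpass(f, f1, f2):
    """1 for f1 <= |f| <= f2, 0 elsewhere."""
    return 1.0 if f1 <= abs(f) <= f2 else 0.0


def ideal_bandstop(f, f1, f2):
    """Complement of the ideal bandpass."""
    return 1.0 - ideal_bandpass(f, f1, f2)


def ideal_allpass(f):
    """Unit magnitude at every frequency."""
    return 1.0


for f in (0.05, 0.15, 0.30):
    print(f, ideal_lowpass(f, 0.2), ideal_highpass(f, 0.2),
          ideal_bandpass(f, 0.1, 0.2), ideal_bandstop(f, 0.1, 0.2),
          ideal_allpass(f))
```

Note that the allpass magnitude is identically 1, so it leaves the magnitude of every frequency component unchanged.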


Does this imply that the allpass filter is useless? The answer is no, because
it may have an effect on the signal's phase. A filter is allpass if
$|H(e^{j2\pi f})| = 1$ for all $f$. Allpass filters also find application as
building blocks for many higher order filters.
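To see a filter that changes only the phase, consider the first-order allpass section $H(z) = (z^{-1} - a)/(1 - a z^{-1})$ with real $|a| < 1$. This particular filter and the value $a = 0.5$ are assumptions chosen for illustration; they do not come from the course text. The sketch evaluates magnitude and phase at a few frequencies.

```python
import cmath
import math


def allpass_response(a, f):
    """Response of H(z) = (z^-1 - a) / (1 - a z^-1) at f cycles per sample."""
    z_inv = cmath.exp(-2j * math.pi * f)
    return (z_inv - a) / (1.0 - a * z_inv)


a = 0.5  # hypothetical real coefficient with |a| < 1
for f in (0.0, 0.1, 0.25, 0.4):
    H = allpass_response(a, f)
    print(f"f = {f:.2f}: |H| = {abs(H):.6f}, phase = {cmath.phase(H):+.4f} rad")
```

The magnitude stays at 1 for every frequency while the phase varies with frequency, which is exactly the behaviour described above.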


Other filter types 


Notch filter 


The notch filter is recognized by its perfect nulls in the frequency response,
see [link].


Notch filter.


Notch filters have many applications. One of them is in recording systems,
where the notch filter serves to remove the power-line frequency 50 Hz and
its harmonics (100 Hz, 150 Hz, ...). Some audio equalisers include a notch
filter.
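A perfect null can be obtained by placing a pair of zeros on the unit circle at the frequency to be removed. The sketch below builds such a three-tap FIR notch for 50 Hz. The sampling rate of 1000 Hz and the function names are assumptions for illustration, and this is not the filter in the course's notchFilter.m. A filter this simple has a single null and a gain that is far from flat elsewhere; removing the harmonics as well would need one such section per harmonic.

```python
import cmath
import math


def notch_taps(f0_hz, fs_hz):
    """Taps of H(z) = 1 - 2 cos(w0) z^-1 + z^-2, with zeros at +/- w0."""
    w0 = 2.0 * math.pi * f0_hz / fs_hz
    return [1.0, -2.0 * math.cos(w0), 1.0]


def magnitude(h, f_hz, fs_hz):
    """Magnitude response of FIR taps h at f_hz for sampling rate fs_hz."""
    w = 2.0 * math.pi * f_hz / fs_hz
    return abs(sum(hk * cmath.exp(-1j * w * k) for k, hk in enumerate(h)))


fs = 1000.0  # hypothetical sampling rate in Hz
h = notch_taps(50.0, fs)
for f in (10.0, 50.0, 200.0):
    print(f"{f:6.1f} Hz: |H| = {magnitude(h, f, fs):.6f}")
```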


Matlab files 


idealFilters.m, notchFilter.m 


Table of Formulas 


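Analog


Delta function: $\delta(t) = 0$ for $t \neq 0$, $\int_{-\infty}^{\infty} \delta(t)\,dt = 1$

Unit step function: $u(t) = 1$ if $t \geq 0$, $0$ otherwise

Angular frequency: $\Omega = 2\pi F$

Energy: $E_x = \int_{-\infty}^{\infty} |x(t)|^2\,dt$

Convolution: $y(t) = \int_{-\infty}^{\infty} x(\tau) h(t - \tau)\,d\tau$

Fourier Transform: $X(i\Omega) = \int_{-\infty}^{\infty} x(t) e^{-i\Omega t}\,dt$

Inverse Fourier Transform: $x(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} X(i\Omega) e^{i\Omega t}\,d\Omega$

Fourier coefficients: $\alpha_k = \frac{1}{T} \int_{0}^{T} x(t) e^{-i 2\pi k t / T}\,dt$

Series expansion: $x(t) = \sum_{k=-\infty}^{\infty} \alpha_k e^{i 2\pi k t / T}$

Parseval: $\frac{1}{T} \int_{0}^{T} |x(t)|^2\,dt = \sum_{k=-\infty}^{\infty} |\alpha_k|^2$


Time Discrete


Unit sample: $\delta(n) = 1$ if $n = 0$, $0$ otherwise

Unit step function: $u(n) = 1$ if $n \geq 0$, $0$ otherwise

Angular frequency: $\omega = 2\pi f$

Energy: $E_x = \sum_{n=-\infty}^{\infty} |x(n)|^2$

Power: $P_x = \frac{1}{N} \sum_{n=0}^{N-1} |x(n)|^2$

Convolution: $y(n) = \sum_{k=-\infty}^{\infty} x(k) h(n - k)$

Discrete Time Fourier Transform: $X(e^{i\omega}) = \sum_{n=-\infty}^{\infty} x(n) e^{-i\omega n}$

Inverse DTFT: $x(n) = \frac{1}{2\pi} \int_{-\pi}^{\pi} X(e^{i\omega}) e^{i\omega n}\,d\omega$

Discrete Fourier Transform: $X(k) = \sum_{n=0}^{N-1} x(n) e^{-i 2\pi k n / N}$

Inverse DFT: $x(n) = \frac{1}{N} \sum_{k=0}^{N-1} X(k) e^{i 2\pi k n / N}$

Parseval: $\frac{1}{N} \sum_{k=0}^{N-1} |X(k)|^2 = \sum_{n=0}^{N-1} |x(n)|^2$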
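The discrete Fourier transform pair and Parseval's relation from the table can be checked with a direct, unoptimised implementation. This is a sketch for illustration only; the function names and the test sequence are assumptions, and a fast Fourier transform routine would be used in practice.

```python
import cmath
import math


def dft(x):
    """Direct evaluation of X(k) = sum over n of x(n) exp(-i 2 pi k n / N)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]


def idft(X):
    """Direct evaluation of x(n) = (1/N) sum over k of X(k) exp(i 2 pi k n / N)."""
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * n / N) for k in range(N)) / N
            for n in range(N)]


x = [1.0, 2.0, 0.0, -1.0]  # hypothetical test sequence
X = dft(x)
x_back = idft(X)

energy_time = sum(abs(v) ** 2 for v in x)
energy_freq = sum(abs(V) ** 2 for V in X) / len(X)
print("energy in time domain:     ", energy_time)
print("energy in frequency domain:", round(energy_freq, 9))
print("reconstruction error:", max(abs(a - b) for a, b in zip(x, x_back)))
```

Both energies should agree up to rounding, and the inverse transform should return the original sequence.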


Library 


What follows is a collection of links to other signal processing and
information theory resources available. Please report dead links and
suggestions for links that we should include.


In addition to these links you should try the Connexions search function 
which allows you to search through all the material in the Connexions 
system. 


Signal processing 


Fundamentals of Electrical Engineering. A comprehensive course available
in Roadmap/Connexions.


Signals and Systems. A comprehensive course available in
Roadmap/Connexions.


Complex to Real. Basic concepts, Fourier Analysis, ISI, Eye diagram...


Johns Hopkins University: Signals, Systems and Control Demonstrations. 


Signal Processing Tutorial. An impressive collection of Java Applets
demonstrating various concepts. Recommended.


Java Digital Signal Processing Editor. The J-DSP Editor, the first on-line 
DSP editor, is used to simulate various DSP techniques. The simulation is 
performed at a high level which gives the "big picture". 


IEEE Signal Processing Society. 


Information Theory 


Information Theory, Inference, and Learning Algorithms. Free book by 
David MacKay of University of Cambridge. 


A short course in Information Theory, by David MacKay of University of 
Cambridge. 


IEEE Information Theory Society.


